(54) Title: MULTIVALENT ANTIGEN-BINDING PROTEINS

(57) Abstract 

[...] multivalent antigen-binding proteins are described. [...]
purification of compositions containing both monomeric and multivalent forms of
[...] and production of multivalent proteins from purified monomers. Production
of multivalent [...] concentration-dependent association of monomeric proteins,
or by rearrangement of regions involving [...] of different regions. Bivalent
proteins, including homobivalent and heterobivalent proteins, [...] for bivalent
single-chain antigen-binding proteins are [...].
2. Description of the Background Art

Antibodies are proteins generated by the immune system to provide a 
specific molecule capable of complexing with an invading molecule, termed
an antigen. Figure 14 shows the structure of a typical antibody molecule.
Natural antibodies have two identical antigen-binding sites, both of which are
specific to a particular antigen. The antibody molecule "recognizes" the
antigen by complexing its antigen-binding sites with areas of the antigen
termed epitopes. The epitopes fit into the conformational architecture of the
antigen-binding sites of the antibody, enabling the antibody to bind to the
antigen.

The antibody molecule is composed of two identical heavy and two 
identical light polypeptide chains, held together by interchain disulfide bonds 
(see Fig. 14). The remainder of this discussion will refer only to one
light/heavy pair of chains, as each light/heavy pair is identical. Each
individual light and heavy chain folds into regions of approximately 110 amino
acids, assuming a conserved three-dimensional conformation. The light chain
comprises one variable region (termed VL) and one constant region (CL), while
the heavy chain comprises one variable region (VH) and three constant regions
(CH1, CH2 and CH3). Pairs of regions associate to form discrete structures as
shown in Figure 14. In particular, the light and heavy chain variable regions,
VL and VH, associate to form an "Fv" area which contains the antigen-binding
site.

The variable regions of both heavy and light chains show considerable
variability in structure and amino acid composition from one antibody
molecule to another, whereas the constant regions show little variability. The
term "variable" as used in this specification refers to the diverse nature of the
amino acid sequences of the antibody heavy and light chain variable regions.
Each antibody recognizes and binds antigen through the binding site defined
by the association of the heavy and light chain variable regions into an Fv
area. The light-chain variable region VL and the heavy-chain variable region
VH of a particular antibody molecule have specific amino acid sequences that
Another advantage of multivalent antigen-binding proteins is the ease
with which they may be produced and engineered, as compared to the 
myeloma-fusing technique pioneered by Kohler and Milstein that is used to 
produce whole antibodies. 

Brief Description of the Drawings. 

The present invention as defined in the claims can be better understood 
with reference to the text and to the following drawings: 

FIG. 1A is a schematic two-dimensional representation of two identical
single-chain antigen-binding protein molecules, each comprising a variable
light chain region (VL), a variable heavy chain region (VH), and a polypeptide
linker joining the two regions. The single-chain antigen-binding protein
molecules are shown binding antigen in their antigen-binding sites.

FIG. 1B depicts a hypothetical homodivalent antigen-binding protein
formed by association of the polypeptide linkers of two monovalent
single-chain antigen-binding proteins from Fig. 1A (the Association model). The
divalent antigen-binding protein is formed by the concentration-driven
association of two identical single-chain antigen-binding protein molecules.

FIG. 1C depicts the hypothetical divalent protein of FIG. 1B with
bound antigen molecules occupying both antigen-binding sites.

FIG. 2A depicts the hypothetical homodivalent protein of Figure 1B.

FIG. 2B depicts three single-chain antigen-binding protein molecules
associated in a hypothetical trimer.

FIG. 2C depicts a hypothetical tetramer of four single-chain
antigen-binding protein molecules.

FIG. 3A depicts two separate and distinct monovalent single-chain
antigen-binding proteins, Anti-A single-chain antigen-binding protein and
Anti-B single-chain antigen-binding protein, with different antigen
specificities, each individually binding either Antigen A or Antigen B.
allow the antigen-binding site to assume a conformation that binds to the
antigen epitope recognized by that particular antibody.

Within the variable regions are found regions in which the amino acid
sequence is extremely variable from one antibody to another. Three of these
so-called "hypervariable" regions or "complementarity-determining regions"
(CDR's) are found in each of the light and heavy chains. The three CDR's
from a light chain and the three CDR's from a corresponding heavy chain
form the antigen-binding site.

Cleavage of the naturally-occurring antibody molecule with the
proteolytic enzyme papain generates fragments which retain their
antigen-binding site. These fragments, commonly known as Fab's (for Fragment,
antigen binding site), are composed of the CL, VL, CH1 and VH regions of the
antibody. In the Fab, the light chain and the fragment of the heavy chain are
covalently linked by a disulfide linkage.

Recent advances in immunobiology, recombinant DNA technology, and
computer science have allowed the creation of single polypeptide chain
molecules that bind antigen. These single-chain antigen-binding molecules
incorporate a linker polypeptide to bridge the individual variable regions, VL
and VH, into a single polypeptide chain. A computer-assisted method for
linker design is described more particularly in U.S. Patent No. 4,704,692,
issued to Ladner et al. in November, 1987, and incorporated herein by
reference. A description of the theory and production of single-chain
antigen-binding proteins is found in U.S. Patent No. 4,946,778 (Ladner et al.),
issued August 7, 1990, and incorporated herein by reference. The single-chain
antigen-binding proteins produced under the process recited in U.S. Patent
4,946,778 have binding specificity and affinity substantially similar to that of
the corresponding Fab fragment.

Bifunctional, or bispecific, antibodies have antigen binding sites of
different specificities. Bispecific antibodies have been generated to deliver
cells, cytotoxins, or drugs to specific sites. An important use has been to
deliver host cytotoxic cells, such as natural killer or cytotoxic T cells, to
specific cellular targets. (U.D. Staerz, O. Kanagawa, M.J. Bevan, Nature
314:628 (1985); S. Songsivilai, P.J. Lachmann, Clin. Exp. Immunol. 79:315
(1990)). Another important use has been to deliver cytotoxic proteins to
specific cellular targets. (V. Raso, T. Griffin, Cancer Res. 41:2073 (1981);
S. Honda, Y. Ichimori, S. Iwasa, Cytotechnology 4:59 (1990)). Another
important use has been to deliver anti-cancer non-protein drugs to specific
cellular targets (J. Corvalan, W. Smith, V. Gore, Intl. J. Cancer Suppl. 2:22
(1988); M. Pimm et al., British J. of Cancer 61:508 (1990)). Such bispecific
antibodies have been prepared by chemical cross-linking (M. Brennan et al.,
Science 229:81 (1985)), disulfide exchange, or the production of
hybrid-hybridomas (quadromas). Quadromas are constructed by fusing hybridomas
that secrete two different types of antibodies against two different antigens
(Kurokawa, T. et al., Biotechnology 7:1163 (1989)).

Summary of the Invention 

This invention relates to the discovery that multivalent forms of
single-chain antigen-binding proteins have significant utility beyond that of the
monovalent single-chain antigen-binding proteins. A multivalent
antigen-binding protein has more than one antigen-binding site. Enhanced binding
activity, di- and multi-specific binding, and other novel uses of multivalent
antigen-binding proteins have been demonstrated or are envisioned here.
Accordingly, the invention is directed to multivalent forms of single-chain
antigen-binding proteins, compositions of multivalent and single-chain
antigen-binding proteins, methods of making and purifying multivalent forms of
single-chain antigen-binding proteins, and uses for multivalent forms of
single-chain antigen-binding proteins. The invention provides a multivalent
antigen-binding protein comprising two or more single-chain protein molecules,
each single-chain molecule comprising a first polypeptide comprising the binding
portion of the variable region of an antibody heavy or light chain; a second
polypeptide comprising the binding portion of the variable region of an
antibody heavy or light chain; and a peptide linker linking the first and second
polypeptides into a single-chain protein.
Also provided is a composition comprising a multivalent antigen-binding
protein substantially free of single-chain molecules.

Also provided is an aqueous composition comprising an excess of 
multivalent antigen-binding protein over single-chain molecules. 

A method of producing a multivalent antigen-binding protein is 
provided, comprising the steps of producing a composition comprising 
multivalent antigen-binding protein and single-chain molecules, each single- 
chain molecule comprising a first polypeptide comprising the binding portion 
of the variable region of an antibody heavy or light chain; a second 
polypeptide comprising the binding portion of the variable region of an 
antibody heavy or light chain; and a peptide linker linking the first and second 
polypeptides into a single-chain molecule; separating the multivalent protein
from the single-chain molecules; and recovering the multivalent protein. 

Also provided is a method of producing multivalent antigen-binding 
protein, comprising the steps of producing a composition comprising single- 
chain molecules as previously defined; dissociating the single-chain molecules; 
reassociating the single-chain molecules; separating the resulting multivalent 
antigen-binding proteins from the single-chain molecules; and recovering the 
multivalent proteins. 

Also provided is another method of producing a multivalent antigen- 
binding protein, comprising the step of chemically cross-linking at least two 
single-chain antigen-binding molecules. 

Also provided is another method of producing a multivalent antigen-
binding protein, comprising the steps of producing a composition comprising 
single-chain molecules as previously defined; concentrating said single-chain 
molecules; separating said multivalent protein from said single-chain 
molecules; and finally recovering said multivalent protein. 

Also provided is another method of producing a multivalent antigen- 
binding protein comprising two or more single-chain molecules, each single- 
chain molecule as previously defined, said method comprising: providing a 
genetic sequence coding for said single-chain molecule; transforming a host 
cell or cells with said sequence; expressing said sequence in said host or hosts;
and recovering said multivalent protein. 

Another aspect of the invention includes a method of detecting an 
antigen in or suspected of being in a sample, which comprises contacting said
sample with the multivalent antigen-binding protein of claim 1 and detecting
whether said multivalent antigen-binding protein has bound to said antigen.

Another aspect of the invention includes a method of imaging the
internal structure of an animal, comprising administering to said animal an
effective amount of a labeled form of the multivalent antigen-binding protein
of claim 1 and measuring detectable radiation associated with said animal.

Another aspect of the invention includes a composition comprising an 
association of a multivalent antigen-binding protein with a therapeutically or 
diagnostically effective agent. 

Another aspect of this invention is a single-chain protein comprising:
(a) a first polypeptide comprising the binding portion of the variable region
of an antibody light chain; (b) a second polypeptide comprising the binding
portion of the variable region of an antibody light chain; and (c) a peptide
linker linking said first and second polypeptides (a) and (b) into said
single-chain protein.

Another aspect of the present invention includes the genetic
constructions encoding the combinations of regions VL-VL and VH-VH for
single-chain molecules, and encoding multivalent antigen-binding proteins.

Another part of this invention is a multivalent single-chain
antigen-binding protein comprising: (a) a first polypeptide comprising the
binding portion of the variable region of an antibody heavy or light chain;
(b) a second polypeptide comprising the binding portion of the variable region
of an antibody heavy or light chain; (c) a peptide linker linking said first
and second polypeptides (a) and (b) into said multivalent protein; (d) a third
polypeptide comprising the binding portion of the variable region of an
antibody heavy or light chain; (e) a fourth polypeptide comprising the binding
portion of the variable region of an antibody heavy or light chain; (f) a
peptide linker linking said third and fourth polypeptides (d) and (e) into said
multivalent protein; and (g) a peptide linker linking said second and third
polypeptides (b) and (d) into said multivalent protein. Also included are
genetic constructions coding for this multivalent single-chain antigen-binding
protein.

Also included are replicable cloning or expression vehicles including 
plasmids, hosts transformed with the aforementioned genetic sequences, and 
methods of producing multivalent proteins with the sequences, transformed 
hosts, and expression vehicles. 

Methods of use are provided, such as a method of using the multivalent 
antigen-binding protein to diagnose a medical condition; a method of using the 
multivalent protein as a carrier to image the specific bodily organs of an 
animal; a therapeutic method of using the multivalent protein to treat a medical 
condition; and an immunotherapeutic method of conjugating a multivalent 
protein with a therapeutically or diagnostically effective agent. Also included 
are labelled multivalent proteins, improved immunoassays using them, and 
improved immunoaffinity purifications. 

An advantage of using multivalent antigen-binding proteins instead of 
single-chain antigen-binding molecules or Fab fragments lies in the enhanced 
binding ability of the multivalent form. Enhanced binding occurs because the 
multivalent form has more binding sites per molecule. Another advantage of 
the present invention is the ability to use multivalent antigen-binding proteins 
as multi-specific binding molecules. 

An advantage of using multivalent antigen-binding proteins instead of
whole antibodies is the enhanced clearing of the multivalent antigen-binding
proteins from the serum due to their smaller size as compared to whole
antibodies, which may afford lower background in imaging applications.
Multivalent antigen-binding proteins may penetrate solid tumors better than 
monoclonals, resulting in better tumor-fighting ability. Also, because they are 
smaller and lack the Fc component of intact antibodies, the multivalent 
antigen-binding proteins of the present invention may be less immunogenic 
than whole antibodies. The Fc component of whole antibodies also contains 
binding sites for liver, spleen and certain other cells and its absence should 
thus reduce accumulation in non-target tissues. 
FIG. 3B depicts a hypothetical bispecific heterodivalent antigen-binding
protein formed from the single-chain antigen-binding proteins of Fig. 3A
according to the Association model.

FIG. 3C depicts the hypothetical heterodivalent antigen-binding protein
of FIG. 3B binding bispecifically, i.e., binding the two different antigens, A
and B.

FIG. 4A depicts two identical single-chain antigen-binding protein
molecules, each having a variable light chain region (VL), a variable heavy
chain region (VH), and a polypeptide linker joining the two regions. The
single-chain antigen-binding protein molecules are shown binding identical
antigen molecules in their antigen-binding sites.

FIG. 4B depicts a hypothetical homodivalent protein formed by the
rearrangement of the VL and VH regions shown in FIG. 4A (the
Rearrangement model). Also shown is bound antigen.

FIG. 5A depicts two single-chain protein molecules, the first having an
anti-B VL and an anti-A VH, and the second having an anti-A VL and an anti-B
VH. The figure shows the non-complementary nature of the VL and VH
regions in each single-chain protein molecule.

FIG. 5B shows a hypothetical bispecific heterodivalent antigen-binding
protein formed by rearrangement of the two single-chain proteins of Figure
5A.

FIG. 5C depicts the hypothetical heterodivalent antigen-binding protein
of FIG. 5B with different antigens A and B occupying their respective
antigen-binding sites.

FIG. 6A is a schematic depiction of a hypothetical trivalent
antigen-binding protein according to the Rearrangement model.

FIG. 6B is a schematic depiction of a hypothetical tetravalent
antigen-binding protein according to the Rearrangement model.

FIG. 7 is a chromatogram depicting the separation of CC49/212
antigen-binding protein monomer from dimer on a cation exchange high
performance liquid chromatographic column. The column is a PolyCAT A
aspartic acid column (Poly LC, Columbia, MD). Monomer is shown as Peak
1, eluting at 27.32 min., and dimer is shown as Peak 2, eluting at 55.52 min.

FIG. 8 is a chromatogram of the purified monomer from Fig. 7.
Monomer elutes at 21.94 min., preceded by dimer (20.135 min.) and trimer
(18.640 min.). Gel filtration column, Protein-Pak 300SW (Waters Associates,
Milford, MA).

FIG. 9 is a similar chromatogram of purified dimer (20.14 min.) from
Fig. 7, run on the gel filtration HPLC column of Fig. 8.

FIG. 10A is an amino acid (SEQ ID NO. 11) and nucleotide (SEQ ID
NO. 10) sequence of the single-chain protein comprising the 4-4-20 VL region
connected through the 212 linker polypeptide to the CC49 VH region.

FIG. 10B is an amino acid (SEQ ID NO. 13) and nucleotide (SEQ ID
NO. 12) sequence of the single-chain protein comprising the CC49 VL region
connected through the 212 linker polypeptide to the 4-4-20 VH region.

FIG. 11 is a chromatogram depicting the separation of the monomer
(27.83 min.) and dimer (50.47 min.) forms of the CC49/212 antigen-binding
protein by cation exchange, on a PolyCAT A cation exchange column (Poly
LC, Columbia, MD).

Fig. 12 shows the separation of monomer (17.65 min.), dimer (15.79
min.), trimer (14.19 min.), and higher oligomers (shoulder at about 13.09
min.) of the B6.2/212 antigen-binding protein. This separation depicts the
results of a 24-hour treatment of a 1.0 mg/ml B6.2/212 single-chain
antigen-binding protein sample. A TSK G2000SW gel filtration HPLC column
(Toyo Soda, Tokyo, Japan) was used.

Fig. 13 shows the results of a 24-hour treatment of a 4.0 mg/ml
CC49/212 antigen-binding protein sample, generating monomer, dimer, and
trimer at 16.91, 14.9, and 13.42 min., respectively. The same TSK gel
filtration column was used as in Fig. 12.

Fig. 14 shows a schematic view of the four-chain structure of a human
IgG molecule.
Fig. 15A is an amino acid (SEQ ID NO. 15) and nucleotide (SEQ ID
NO. 14) sequence of the 4-4-20/212 single-chain antigen-binding protein with
a single cysteine hinge.

Fig. 15B is an amino acid (SEQ ID NO. 17) and nucleotide (SEQ ID
NO. 16) sequence of the 4-4-20/212 single-chain antigen-binding protein with
the two-cysteine hinge.

Fig. 16 shows the amino acid (SEQ ID NO. 19) and nucleotide (SEQ
ID NO. 18) sequence of a divalent CC49/212 single-chain antigen-binding
protein.

Fig. 17 shows the expression of the divalent CC49/212 single-chain
antigen-binding protein of Fig. 16 at 42°C, on an SDS-PAGE gel containing
total E. coli protein. Lane 1 contains the molecular weight standards. Lane
2 is the uninduced E. coli production strain grown at 30°C. Lane 3 is divalent
CC49/212 single-chain antigen-binding protein induced by growth at 42°C.
The arrow shows the band of expressed divalent CC49/212 single-chain
antigen-binding protein.

Fig. 18 is a graphical representation of four competition
radioimmunoassays (RIA) in which unlabeled CC49 IgG (open circles),
CC49/212 single-chain antigen-binding protein (closed circles), CC49/212
divalent antigen-binding protein (closed squares), and anti-fluorescein
4-4-20/212 single-chain antigen-binding protein (open squares) competed against
a radiolabeled CC49 IgG for binding to the TAG-72 antigen on a
human breast carcinoma extract.

Figure 19A is an amino acid (SEQ ID NO. 21) and nucleotide (SEQ
ID NO. 20) sequence of the single-chain polypeptide comprising the 4-4-20 VL
region connected through the 217 linker polypeptide to the CC49 VH region.

Figure 19B is an amino acid (SEQ ID NO. 23) and nucleotide (SEQ
ID NO. 22) sequence of the single-chain polypeptide comprising the CC49 VL
region connected through the 217 linker polypeptide to the 4-4-20 VH region.

Figure 20 is a chromatogram depicting the purification of CC49/4-4-20
heterodimer Fv on a cation exchange high performance liquid chromatographic
column. The column is a PolyCAT A aspartic acid column (Poly LC,
Columbia, MD). The heterodimer Fv is shown as peak 5, eluting at 30.10
min.

Figure 21 is a coomassie-blue stained 4-20% SDS-PAGE gel showing
the proteins separated in Figure 20. Lane 1 contains the molecular weight
standards. Lane 3 contains the starting material before separation. Lanes 4-8
contain fractions 2, 3, 5, 6 and 7, respectively. Lane 9 contains purified
CC49/212.

Figure 22A is a chromatogram used to determine the molecular size of
fraction 2 from Figure 20. A TSK G3000SW gel filtration HPLC column was
used (Toyo Soda, Tokyo, Japan).

Figure 22B is a chromatogram used to determine the molecular size of
fraction 5 from Figure 20. A TSK G3000SW gel filtration HPLC column was
used (Toyo Soda, Tokyo, Japan).

Figure 22C is a chromatogram used to determine the molecular size of
fraction 6 from Figure 20. A TSK G3000SW gel filtration HPLC column was
used (Toyo Soda, Tokyo, Japan).

Figure 23 shows a Scatchard analysis of the fluorescein binding affinity
of the CC49/4-4-20 heterodimer Fv (fraction 5 in Figure 20).

Figure 24 is a graphical representation of three competition
enzyme-linked immunosorbent assays (ELISA) in which unlabeled CC49/4-4-20 Fv
(closed squares), CC49/212 single-chain Fv (open squares), and MOPC-21 IgG
(+) competed against a biotin-labeled CC49 IgG for binding to the TAG-72
antigen on a human breast carcinoma extract. MOPC-21 is a control antibody
that does not bind to TAG-72 antigen.

Figure 25 shows a coomassie-blue stained non-reducing 4-20%
SDS-PAGE gel. Lanes 1 and 9 contain the molecular weight standards. Lane 3
contains the 4-4-20/212 CPPC single-chain antigen-binding protein after
purification. Lanes 4, 5 and 6 contain the 4-4-20/212 CPPC single-chain
antigen-binding protein after treatment with DTT and air oxidation. Lane 7
contains 4-4-20/212 single-chain antigen-binding protein.

Figure 26 shows a coomassie-blue stained reducing 4-20% SDS-PAGE
gel (samples were treated with β-mercaptoethanol prior to being loaded on the
gel). Lanes 1 and 8 contain the molecular weight standards. Lane 3 contains
the 4-4-20/212 CPPC single-chain antigen-binding protein after treatment with
bis-maleimidehexane. Lane 5 contains peak 1 of bis-maleimidehexane treated
4-4-20/212 CPPC single-chain antigen-binding protein. Lane 6 contains peak
3 of bis-maleimidehexane treated 4-4-20/212 CPPC single-chain
antigen-binding protein.

Detailed Description of the Preferred Embodiments 

This invention relates to the discovery that multivalent forms of
single-chain antigen-binding proteins have significant utility beyond that of the
monovalent single-chain antigen-binding proteins. A multivalent
antigen-binding protein has more than one antigen-binding site. For the purposes
of this application, "valent" refers to the numerosity of antigen binding sites.
Thus, a bivalent protein refers to a protein with two binding sites. Enhanced
binding activity, bi- and multi-specific binding, and other novel uses of
multivalent antigen-binding proteins have been demonstrated or are envisioned
here. Accordingly, the invention is directed to multivalent forms of
single-chain antigen-binding proteins, compositions of multivalent and single-chain
antigen-binding proteins, methods of making and purifying multivalent forms
of single-chain antigen-binding proteins, and new and improved uses for
multivalent forms of single-chain antigen-binding proteins. The invention
provides a multivalent antigen-binding protein comprising two or more
single-chain protein molecules, each single-chain molecule comprising a first
polypeptide comprising the binding portion of the variable region of an
antibody heavy or light chain; a second polypeptide comprising the binding
portion of the variable region of an antibody heavy or light chain; and a
peptide linker linking the first and second polypeptides into a single-chain
protein.

The term "multivalent" means any assemblage, covalently or
non-covalently joined, of two or more single-chain proteins, the assemblage having
more than one antigen-binding site. The single-chain proteins composing the
assemblage may have antigen-binding activity, or they may lack
antigen-binding activity individually but be capable of assembly into active
multivalent antigen-binding proteins. The term "multivalent" encompasses
bivalent, trivalent, tetravalent, etc. It is envisioned that multivalent forms
above bivalent may be useful for certain applications.

A preferred form of the multivalent antigen-binding protein comprises
bivalent proteins, including heterobivalent and homobivalent forms. The term
"bivalent" means an assemblage of single-chain proteins associated with each
other to form two antigen-binding sites. The term "heterobivalent" indicates
multivalent antigen-binding proteins that are bispecific molecules capable of
binding to two different antigenic determinants. Therefore, heterobivalent
proteins have two antigen-binding sites that have different binding specificities.
The term "homobivalent" indicates that the two binding sites are for the same
antigenic determinant.
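
The terminology defined above can be illustrated with a minimal Python sketch. The class and function names (`BindingSite`, `Assemblage`, `classify`) are hypothetical and serve only to restate the definitions; they are not part of the specification, and the sketch models nothing beyond the count and specificity of binding sites.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class BindingSite:
    """One antigen-binding site, labeled by the antigenic determinant it binds."""
    determinant: str


@dataclass(frozen=True)
class Assemblage:
    """Hypothetical model of an assemblage of single-chain proteins.

    Only the resulting binding sites are represented; how the chains are
    joined (covalently or non-covalently) is deliberately left out.
    """
    sites: Tuple[BindingSite, ...]

    @property
    def valency(self) -> int:
        # "Valent" refers to the number of antigen-binding sites.
        return len(self.sites)

    @property
    def specificities(self) -> FrozenSet[str]:
        # Distinct antigenic determinants recognized by the assemblage.
        return frozenset(site.determinant for site in self.sites)


def classify(protein: Assemblage) -> str:
    """Name an assemblage using the terms defined in the text."""
    if protein.valency < 2:
        return "not multivalent"
    if protein.valency == 2:
        if len(protein.specificities) == 1:
            return "homobivalent"
        return "heterobivalent"
    names = {3: "trivalent", 4: "tetravalent"}
    return names.get(protein.valency, f"multivalent ({protein.valency} sites)")
```

For example, an assemblage with two sites that both bind determinant "A" is classified as homobivalent, while one with sites for "A" and "B" is classified as heterobivalent.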

The terms "single-chain molecule" or "single-chain protein" are used
interchangeably here. They are structurally defined as comprising the binding
portion of a first polypeptide from the variable region of an antibody,
associated with the binding portion of a second polypeptide from the variable
region of an antibody, the two polypeptides being joined by a peptide linker
linking the first and second polypeptides into a single polypeptide chain. The
single polypeptide chain thus comprises a pair of variable regions connected
by a polypeptide linker. The regions may associate to form a functional
antigen-binding site, as in the case wherein the regions comprise a light-chain
and a heavy-chain variable region pair with appropriately paired
complementarity determining regions (CDRs). In this case, the single-chain
protein is referred to as a "single-chain antigen-binding protein" or
"single-chain antigen-binding molecule."

Alternatively, the variable regions may have unnaturally paired CDRs
or may both be derived from the same kind of antibody chain, either heavy or
light, in which case the resulting single-chain molecule may not display a
functional antigen-binding site. The single-chain antigen-binding protein
molecule is more fully described in U.S. Patent No. 4,946,778 (Ladner et al.),
and incorporated herein by reference.

Without being bound by any particular theory, the inventors speculate
on several models which can equally explain the phenomenon of multivalence.
The inventors' models are presented herein for the purpose of illustration only,
and are not to be construed as limitations upon the scope of the invention.
The invention is useful and operable regardless of the precise mechanism of

Figure 1 depicts the first hypothetical model for the creation of a
multivalent protein, the "Association" model. Fig. 1A shows two monovalent
single-chain antigen-binding proteins, each composed of a VL, a VH, and a
linker polypeptide covalently bridging the two. Each monovalent single-chain
antigen-binding protein is depicted having an identical antigen-binding site
containing antigen. Figure 1B shows the simple association of the two
single-chain antigen-binding proteins to create the bivalent form of the multivalent
protein. It is hypothesized that simple hydrophobic forces between the
monovalent proteins are responsible for their association in this manner. The
origin of the multivalent proteins may be traceable to their concentration
dependence. The monovalent units retain their original association between
the VH and VL regions. Figure 1C shows the newly-formed homobivalent
protein binding two identical antigen molecules simultaneously. Homobivalent
antigen-binding proteins are necessarily monospecific for antigen.
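
The concentration dependence suggested by the Association model can be illustrated with a textbook mass-action monomer-dimer equilibrium. This is an assumption made for illustration only: the specification does not state an equilibrium model or any dissociation constant, and the function name `dimer_fraction` and its parameters are hypothetical. With a dissociation constant Kd = [M]^2/[D] and total chain concentration T = [M] + 2[D], the free monomer concentration is [M] = Kd(sqrt(1 + 8T/Kd) - 1)/4.

```python
import math


def dimer_fraction(total_conc: float, kd: float) -> float:
    """Fraction of single-chain molecules found in dimers at equilibrium.

    Assumes a simple reversible 2 M <-> D equilibrium (an illustrative
    assumption, not a model stated in the specification).

    total_conc: total concentration of chains, T = [M] + 2[D]
    kd:         dissociation constant, Kd = [M]^2 / [D], same units as total_conc
    """
    if total_conc <= 0 or kd <= 0:
        raise ValueError("total_conc and kd must be positive")
    # Solve 2[M]^2/Kd + [M] - T = 0 for the positive root.
    monomer = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    return 1.0 - monomer / total_conc
```

Under this assumed model, raising the total concentration at fixed Kd always raises the dimer fraction, which is the qualitative behavior the Association model relies on.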

Homovalent proteins are depicted in Figs. 2A through 2C formed
according to the Association model. Fig. 2A depicts a homobivalent protein,
Fig. 2B a trivalent protein, and Fig. 2C a tetravalent protein. Of course, the
limitations of two-dimensional images of three-dimensional objects must be 
taken into account. Thus, the actual spatial arrangement of multivalent 
proteins can be expected to vary somewhat from these figures. 

A heterobivalent antigen-binding protein has two different binding sites,
the sites having different binding specificities. Figures 3A through 3C depict
the Association model pathway to the creation of a heterobivalent protein.
Figure 3A shows two monovalent single-chain antigen-binding proteins,
Anti-A single-chain antigen-binding protein and Anti-B single-chain antigen-binding
protein, with antigen types A and B occupying the respective binding sites.
Figure 3B depicts the heterobivalent protein formed by the simple association
of the original monovalent proteins. Figure 3C shows the heterobivalent
protein having bound antigens A and B into the antigen-binding sites. Figure
3C therefore shows the heterobivalent protein binding in a bispecific manner.

An alternative model for the formation of multivalent antigen-binding
proteins is shown in Figures 4 through 6. This "Rearrangement" model
hypothesizes the dissociation of the variable region interface by contact with
dissociating agents such as guanidine hydrochloride, urea, or alcohols such as
ethanol, either alone or in combination. Combinations and relevant
concentration ranges of dissociating agents are recited in the discussion
concerning dissociating agents, and in Example 2. Subsequent re-association
of dissociated regions allows variable region recombination differing from the
starting single-chain proteins, as depicted in Fig. 4B. The homobivalent
antigen-binding protein of Figure 4B is formed from the parent single-chain
antigen-binding proteins shown in Figure 4A, the recombined bivalent protein
having VL and VH from the parent monovalent single-chain proteins. The
homobivalent protein of Figure 4B is a fully functional monospecific bivalent
protein, shown actively binding two antigen molecules.

Figures 5A-5C show the formation of heterobivalent antigen-binding
proteins via the Rearrangement model. Figure 5A shows a pair of single-chain
proteins, each having a VL with complementarity determining regions
(CDRs) that do not match those of the associated VH. These single-chain
proteins have reduced or no ability to bind antigen because of the mixed
nature of their antigen-binding sites, and thus are made specifically to be
assembled into multivalent proteins through this route. Figure 5B shows the
heterobivalent antigen-binding protein formed whereby the VH and VL regions
of the parent proteins are shared between the separate halves of the
heterobivalent protein. Figure 5C shows the binding of two different antigen
molecules to the resultant functional bispecific heterobivalent protein. The
Rearrangement model also explains the generation of multivalent proteins of
a higher order than bivalent, as it can be appreciated that more than a pair of
single-chain proteins can be reassembled in this manner. These are depicted
in Figures 6A and 6B.

One of the major utilities of the multivalent antigen-binding protein is
in the heterobivalent form, in which one specificity is for one type of hapten
or antigen, and the second specificity is for a second type of hapten or
antigen. A multivalent molecule having two distinct binding specificities has
many potential uses. For instance, one antigen-binding site may be specific
for a cell-surface epitope of a target cell, such as a tumor cell or other
undesirable cell. The other antigen-binding site may be specific for a
cell-surface epitope of an effector cell, such as the CD3 protein of a cytotoxic
T-cell. In this way, the heterobivalent antigen-binding protein may guide a
cytotoxic cell to a particular class of cells that are to be preferentially
attacked.

Other uses of heterobivalent antigen-binding proteins are the specific
targeting and destruction of blood clots by a bispecific molecule with
specificity for tissue plasminogen activator (tPA) and fibrin; the specific
targeting of pro-drug activating enzymes to tumor cells by a bispecific
molecule with specificity for tumor cells and enzyme; and specific targeting
of cytotoxic proteins to tumor cells by a bispecific molecule with specificity
for tumor cells and a cytotoxic protein. This list is illustrative only, and any
use for which a multivalent specificity is appropriate comes within the scope
of this invention.

The invention also extends to uses for the multivalent antigen-binding
proteins in purification and biosensors. Affinity purification is made possible
by affixing the multivalent antigen-binding protein to a support, with the
antigen-binding sites exposed to and in contact with the ligand molecule to be
separated, and thus purified. Biosensors generate a detectable signal upon
binding of a specific antigen to an antigen-binding molecule, with subsequent
processing of the signal. Multivalent antigen-binding proteins, when used as
the antigen-binding molecule in biosensors, may change conformation upon
binding, thus generating a signal that may be detected.
Essentially all of the uses for which monoclonal or polyclonal
antibodies, or fragments thereof, have been envisioned by the prior art can
be addressed by the multivalent proteins of the present invention. These uses
include detectably-labelled forms of the multivalent protein. Types of labels
are well-known to those of ordinary skill in the art. They include
radiolabelling, chemiluminescent labelling, fluorochromic labelling, and
chromophoric labelling. Other uses include imaging the internal structure of
an animal (including a human) by administering an effective amount of a
labelled form of the multivalent protein and measuring detectable radiation
associated with the animal. They also include improved immunoassays,
including sandwich immunoassay, competitive immunoassay, and other
immunoassays wherein the labelled antibody can be replaced by the
multivalent antigen-binding protein of this invention.

A first preferred method of producing multivalent antigen-binding
proteins involves separating the multivalent proteins from a production
composition that comprises both multivalent and single-chain proteins, as
represented in Example 1. The method comprises producing a composition
of multivalent and single-chain proteins, separating the multivalent proteins
from the single-chain proteins, and recovering the multivalent proteins.

A second preferred method of producing multivalent antigen-binding
proteins comprises the steps of producing single-chain protein molecules,
dissociating said single-chain molecules, reassociating the single-chain
molecules such that a significant fraction of the resulting composition includes
multivalent forms of the single-chain antigen-binding proteins, separating
multivalent antigen-binding proteins from single-chain molecules, and
recovering the multivalent proteins. This process is illustrated in more
detail in Example 2. For the purposes of this method, the term "producing a
composition comprising single-chain molecules" may indicate the actual
production of these molecules. The term may also include procuring them
from whatever commercial or institutional source makes them available. Use
of the term "producing single-chain proteins" means production of single-chain
proteins by any process, but preferably according to the process set forth in
U.S. Patent No. 4,946,778 (Ladner et al.). Briefly, that patent pertains to a
single polypeptide chain antigen-binding molecule which has binding
specificity and affinity substantially similar to the binding specificity and
affinity of the aggregate light and heavy chain variable regions of an antibody,
to genetic sequences coding therefor, and to recombinant DNA methods of
producing such molecules, and uses for such molecules. The single-chain
protein produced by the Ladner et al. methodology comprises two regions
linked by a linker polypeptide. The two regions are termed the VH and VL
regions, each region comprising one half of a functional antigen-binding site.

The term "dissociating said single-chain molecules" means to cause the 
physical separation of the two variable regions of the single-chain protein 
without causing denaturation of the variable regions. 

"Dissociating agents" are defined herein to include all agents capable 
of dissociating the variable regions, as defined above. In the context of this 
invention, the term includes the well-known agents alcohol (including ethanol), 
guanidine hydrochloride (GuHCl), and urea. Others will be apparent to those 
of ordinary skill in the art, including detergents and similar agents capable of 
interrupting the interactions that maintain protein conformation. In the 
preferred embodiment, a combination of GuHCl and ethanol (EtOH) is used 
as the dissociating agent. A preferred range for ethanol and GuHCl is from
0 to 50% EtOH, vol/vol, and 0 to 2.0 moles per liter (M) GuHCl. A more
preferred range is from 10-30% EtOH and 0.5-1.0 M GuHCl, and a most
preferred range is 20% EtOH, 0.5 M GuHCl. A preferred dissociation buffer
contains 0.5 M guanidine hydrochloride, 20% ethanol, 0.05 M TRIS, and
0.01 M CaCl2, pH 8.0.
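
The preferred dissociation buffer recipe can be turned into weighing amounts
for a given batch volume. The following is a minimal sketch only. The molar
masses are standard reference values that do not appear in this text, and the
function name, the choice of Tris base and anhydrous CaCl2, and the treatment
of 20% ethanol as vol/vol of absolute ethanol are illustrative assumptions.
Adjustment to pH 8.0 is not modeled.

```python
# Molar masses in g/mol (standard reference values, assumed; not from the text).
MOLAR_MASS = {
    "guanidine_hcl": 95.53,
    "tris_base": 121.14,
    "cacl2_anhydrous": 110.98,
}


def dissociation_buffer(volume_l):
    """Return component amounts for the preferred dissociation buffer.

    Concentrations follow the text: 0.5 M GuHCl, 20% ethanol (vol/vol),
    0.05 M TRIS, 0.01 M CaCl2. The pH adjustment to 8.0 is done separately.
    """
    return {
        "guanidine_hcl_g": 0.5 * MOLAR_MASS["guanidine_hcl"] * volume_l,
        "tris_base_g": 0.05 * MOLAR_MASS["tris_base"] * volume_l,
        "cacl2_anhydrous_g": 0.01 * MOLAR_MASS["cacl2_anhydrous"] * volume_l,
        "ethanol_ml": 0.20 * volume_l * 1000.0,
    }


if __name__ == "__main__":
    for name, amount in dissociation_buffer(1.0).items():
        print(f"{name}: {amount:.3f}")
```

If a hydrated salt or Tris-HCl is used instead, the corresponding molar mass
would have to be substituted.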

Use of the term "re-associating said single-chain molecules" is meant
to describe the reassociation of the variable regions by contacting them with
a buffer solution that allows reassociation. Such a buffer is preferably used
in the present invention and is characterized as being composed of 0.04 M
MOPS, 0.10 M calcium acetate, pH 7.5. Other buffers allowing the
reassociation of the VH and VL regions are well within the expertise of one of
ordinary skill in the art.



WO 93/11161    PCT/US92/09965

The separation of the multivalent protein from the single-chain
molecules occurs by use of standard techniques known in the art, particularly
including cation exchange or gel filtration chromatography.

Cation exchange chromatography is the general liquid chromatographic
technique of ion-exchange chromatography utilizing anion columns well-known
to those of ordinary skill in the art. In this invention, the cations exchanged
are the single-chain and multivalent protein molecules. Since multivalent
proteins will have some multiple of the net charge of the single-chain
molecule, the multivalent proteins are retained more strongly and are thus
separated from the single-chain molecules. The preferred cationic exchanger
of the present invention is a polyaspartic acid column, as shown in Figure 7.
Figure 7 depicts the separation of single-chain protein (Peak 1, 27.32 min.)
from bivalent protein (Peak 2, 55.54 min.). Those of ordinary skill in the art
will realize that the invention is not limited to any particular type of
chromatography column, so long as it is capable of separating the two forms
of protein molecules.

Gel filtration chromatography is the use of a gel-like material to
separate proteins on the basis of their molecular weight. A "gel" is a matrix
of water and a polymer, such as agarose or polymerized acrylamide. The
present invention encompasses the use of gel filtration HPLC (high
performance liquid chromatography), as will be appreciated by one of ordinary
skill in the art. Figure 8 is a chromatogram depicting the use of a Waters
Associates' Protein-Pak 300 SW gel filtration column to separate monovalent
single-chain protein from multivalent protein, including the monomer (21.940
min.), bivalent protein (20.135 min.), and trivalent protein (18.640 min.).

Recovering the multivalent antigen-binding proteins is accomplished by
standard collection procedures well known in the chemical and biochemical
arts. In the context of the present invention, recovering the multivalent protein
preferably comprises collection of eluate fractions containing the peak of
interest from either the cation exchange column or the gel filtration HPLC
column. Manual and automated fraction collection are well-known to one of
ordinary skill in the art. Subsequent processing may involve lyophilization of
the eluate to produce a stable solid, or further purification.

A third preferred method of producing multivalent antigen-binding 
proteins is to start with purified single-chain proteins at a lower concentration, 
and then increase the concentration until some significant fraction of 
multivalent proteins is formed. The multivalent proteins are then separated 
and recovered. The concentrations conducive to formation of multivalent 
proteins in this manner are from about 0.5 milligram per milliliter (mg/ml) to 
the concentration at which precipitates begin to form. 

The use of the term "substantially free" when used to describe a 
composition of multivalent and single-chain antigen-binding protein molecules 
means the lack of a significant peak corresponding to the single-chain 
molecule, when the composition is analyzed by cation exchange
chromatography, as disclosed in Example 1, or by gel filtration
chromatography, as disclosed in Example 2.

By use of the term "aqueous composition" is meant any composition 
of single-chain molecules and multivalent proteins including a portion of 
water. In the same context, the phrase "an excess of multivalent antigen- 
binding protein over single-chain molecules" indicates that the composition 
comprises more than 50% of multivalent antigen-binding protein. 

The use of the term "cross-linking" refers to chemical means by which
one can produce multivalent antigen-binding proteins from monovalent
single-chain protein molecules. For example, the incorporation of a cross-linkable
sulfhydryl chemical group as a cysteine residue in the single-chain proteins
allows cross-linking by mild reduction of the sulfhydryl group. Both
monospecific and multispecific multivalent proteins can be produced from
single-chain proteins by cross-linking the free cysteine groups from two or
more single-chain proteins, causing a covalent chemical linkage to form
between the individual proteins. Free cysteines have been engineered into the
C-terminal portion of the 4-4-20/212 single-chain antigen-binding protein, as
discussed in Example 5 and Example 8. These free cysteines may then be
cross-linked to form multivalent antigen-binding proteins.
The invention also comprises single-chain proteins, comprising: (a) a
first polypeptide comprising the binding portion of the variable region of an
antibody light chain; (b) a second polypeptide comprising the binding portion
of the variable region of an antibody light chain; and (c) a peptide linker
linking said first and second polypeptides (a) and (b) into said single-chain
protein. A similar single-chain protein comprising the heavy chain variable
regions is also a part of this invention. Genetic sequences encoding these
molecules are also included in the scope of this invention. Since these proteins
are comprised of two similar variable regions, they do not necessarily have
any antigen-binding capability.

The invention also includes a DNA sequence encoding a bispecific
bivalent antigen-binding protein. Example 4 and Example 7 discuss in detail
the sequences that appear in Figs. 10A and 10B that allow one of ordinary
skill to construct a heterobivalent antigen-binding molecule. Figure 10A is an
amino acid and nucleotide sequence listing of the single-chain protein
comprising the 4-4-20 VL region connected through the 212 linker polypeptide
to the CC49 VH region. Figure 10B is a similar listing of the single-chain
protein comprising the CC49 VL region connected through the 212 linker
polypeptide to the 4-4-20 VH region. Subjecting a composition including these
single-chain molecules to dissociating and subsequent re-associating conditions
results in the production of a bivalent protein with two different binding
specificities.

Synthesis of DNA sequences is well known in the art, and possible
through at least two routes. First, it is well-known that DNA sequences may
be synthesized de novo through the use of automated DNA synthesizers, once
the primary sequence information is known. Alternatively, it is possible to
obtain a DNA sequence coding for a multivalent single-chain antigen-binding
protein by removing the stop codons from the end of a gene encoding a
single-chain antigen-binding protein, and then inserting a linker and a gene encoding
a second single-chain antigen-binding protein. Example 6 demonstrates the
construction of a DNA sequence coding for a bivalent single-chain
antigen-binding protein. Other methods of genetically constructing multivalent
single-chain antigen-binding proteins come within the spirit and scope of the present
invention.

Having now generally described this invention, the same will better be
understood by reference to certain specific examples which are included for
purposes of illustration and are not intended to limit it unless otherwise
specified.

Example 1 

Production of Multivalent
Antigen-Binding Proteins During Purification

In the production of multivalent antigen-binding proteins, the same
recombinant E. coli production system that was used for prior single-chain
antigen-binding protein production was used. See Bird, et al., Science 242:423
(1988). This production system produced between 2 and 20% of the total E.
coli protein as antigen-binding protein. For protein recovery, the frozen cell
paste from three 10-liter fermentations (600-900 g) was thawed overnight at
4°C and gently resuspended at 4°C in 50 mM Tris-HCl, 1.0 mM EDTA, 100
mM KCl, 0.1 mM PMSF, pH 8.0 (lysis buffer), using 10 liters of lysis buffer
for every kilogram of wet cell paste. When thoroughly resuspended, the
chilled mixture was passed three times through a Manton-Gaulin cell
homogenizer to totally lyse the cells. Because the cell homogenizer raised the
temperature of the cell lysate to 25±5°C, the cell lysate was cooled to
5±2°C with a Lauda/Brinkman chilling coil after each pass. Complete lysis
was verified by visual inspection under a microscope.

The cell lysate was centrifuged at 24,300g for 30 min. at 6°C using a
Sorvall RC-5B centrifuge. The pellet containing the insoluble antigen-binding
protein was retained, and the supernatant was discarded. The pellet was
washed by gently scraping it from the centrifuge bottles and resuspending it
in 5 liters of lysis buffer/kg of wet cell paste. The resulting 3.0- to 4.5-liter
suspension was again centrifuged at 24,300g for 30 min. at 6°C, and the
supernatant was discarded. This washing of the pellet removes soluble E. coli
proteins and can be repeated as many as five times. At any time during this
washing procedure the material can be stored as a frozen pellet at -20°C. A
substantial time saving in the washing steps can be accomplished by utilizing
a Pellicon tangential flow apparatus equipped with 0.22-µm microporous
filters, in place of centrifugation.
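
The buffer volumes in the lysis and wash steps scale with the wet weight of
the cell paste. The sketch below simply restates the two ratios given above
(10 liters of lysis buffer per kilogram for resuspension, 5 liters per
kilogram for each wash); the function name is illustrative and not part of
the described procedure.

```python
LYSIS_L_PER_KG = 10.0  # resuspension ratio stated in the text
WASH_L_PER_KG = 5.0    # pellet wash ratio stated in the text


def buffer_volume_l(wet_paste_g, liters_per_kg):
    """Lysis buffer volume in liters for a given wet cell paste mass in grams."""
    return wet_paste_g / 1000.0 * liters_per_kg


if __name__ == "__main__":
    for paste_g in (600.0, 900.0):
        lysis = buffer_volume_l(paste_g, LYSIS_L_PER_KG)
        wash = buffer_volume_l(paste_g, WASH_L_PER_KG)
        print(f"{paste_g:.0f} g paste: {lysis:.1f} L lysis, {wash:.1f} L per wash")
```

For the stated 600-900 g of paste, the wash volume works out to 3.0 to 4.5
liters, which agrees with the suspension volume quoted in the text.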

The washed pellet was solubilized at 4°C in freshly prepared 6 M
guanidine hydrochloride, 50 mM Tris-HCl, 10 mM CaCl2, 50 mM KCl, pH
8.0 (dissociating buffer), using 9 ml/g of pellet. If necessary, a few quick
pulses from a Heat Systems Ultrasonics tissue homogenizer can be used to
complete the solubilization. The resulting suspension was centrifuged at
24,300g for 45 min. at 6°C and the pellet was discarded. The optical density
of the supernatant was determined at 280 nm and if the OD280 was above 30,
additional dissociating buffer was added to obtain an OD280 of approximately
25.
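
The adjustment of the supernatant to an OD280 of about 25 is a simple
dilution. As a rough sketch, assuming that optical density scales linearly
with dilution and that volumes are additive (neither assumption is stated in
the text), the volume of dissociating buffer to add can be estimated as
follows. The function name and defaults are illustrative.

```python
def buffer_to_add_ml(volume_ml, od280, target_od=25.0, threshold=30.0):
    """Estimate the dissociating buffer volume needed to reach the target OD280.

    No buffer is added unless the measured OD280 exceeds the threshold of 30
    given in the text. Linear scaling of OD with dilution is assumed.
    """
    if od280 <= threshold:
        return 0.0
    # C1 * V1 = C2 * (V1 + V_add)  =>  V_add = V1 * (C1 / C2 - 1)
    return volume_ml * (od280 / target_od - 1.0)


if __name__ == "__main__":
    print(buffer_to_add_ml(1000.0, 40.0))  # supernatant above the threshold
    print(buffer_to_add_ml(1000.0, 28.0))  # below the threshold, nothing added
```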

The supernatant was slowly diluted into cold (4-7°C) refolding buffer
(50 mM Tris-HCl, 10 mM CaCl2, 50 mM KCl, pH 8.0) until a 1:10 dilution
was reached (final volume 10-20 liters). Re-folding occurs over approximately
eighteen hours under these conditions. The best results are obtained when the
GuHCl extract is slowly added to the refolding buffer over a 2-h period, with
gentle mixing. The solution was left undisturbed for at least a 20-h period,
and 95% ethanol was added to this solution such that the final ethanol
concentration was approximately 20%. This solution was left undisturbed until
the flocculated material settled to the bottom, usually not less than sixty
minutes. The solution was filtered through a 0.2 µm Millipore Millipak 200.
This filtration step may be optionally preceded by a centrifugation step. The
filtrate was concentrated to 1 to 2 liters using an Amicon spiral cartridge with
a 10,000 MWCO cartridge, again at 4°C.
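
The volume of 95% ethanol needed to bring the refolded solution to
approximately 20% ethanol follows from a mass balance on ethanol. The sketch
below assumes ideal, additive volumes (ethanol-water mixtures contract
slightly on mixing, so the result is an estimate only); the function name is
illustrative.

```python
def ethanol_stock_volume_l(solution_l, stock_fraction=0.95, final_fraction=0.20):
    """Volume of ethanol stock to add so the mixture reaches final_fraction.

    Ethanol balance: stock_fraction * V_add = final_fraction * (V + V_add),
    assuming additive volumes and no ethanol in the starting solution.
    """
    if not 0.0 < final_fraction < stock_fraction:
        raise ValueError("final fraction must lie between 0 and the stock fraction")
    return solution_l * final_fraction / (stock_fraction - final_fraction)


if __name__ == "__main__":
    for volume in (10.0, 20.0):
        added = ethanol_stock_volume_l(volume)
        print(f"{volume:.0f} L refolded solution: add about {added:.2f} L of 95% ethanol")
```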

The concentrated crude antigen-binding protein sample was dialyzed
against Buffer A (60 mM MOPS, 0.5 mM Ca acetate, pH 6.0-6.4) until the
conductivity was lowered to that of Buffer A. The sample was then loaded on
a 21.5 x 250-mm polyaspartic acid PolyCAT A column, manufactured by
PolyLC of Columbia, Maryland. If more than 60 mg of protein is loaded on this
column, the resolution begins to deteriorate; thus, the concentrated crude
sample often must be divided into several PolyCAT A runs. Most
antigen-binding proteins have an extinction coefficient of about 2.0 ml mg^-1 cm^-1 at
280 nm and this can be used to determine protein concentration. The
antigen-binding protein sample was eluted from the PolyCAT A column with a 50-min
linear gradient from Buffer A to Buffer B (see Table 1). Most of the
single-chain proteins elute between 20 and 26 minutes when this gradient is used.
This corresponds to an eluting solvent composition of approximately 70%
Buffer A and 30% Buffer B. Most of the bivalent antigen-binding proteins
elute later than 45 minutes, which corresponds to over 90% Buffer B.
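
The extinction coefficient and the 60 mg loading limit can be combined to
plan how a crude sample is split across column runs. The following is a
minimal sketch using the Beer-Lambert relation with the approximate
coefficient of 2.0 ml mg^-1 cm^-1 quoted above; the 1 cm path length, the
dilution factor parameter, and the function names are illustrative
assumptions.

```python
import math


def protein_mg_per_ml(a280, extinction=2.0, path_cm=1.0, dilution=1.0):
    """Protein concentration from absorbance at 280 nm (Beer-Lambert).

    extinction is in ml mg^-1 cm^-1; dilution is the factor by which the
    sample was diluted before the reading was taken.
    """
    return a280 / (extinction * path_cm) * dilution


def runs_needed(total_mg, max_load_mg=60.0):
    """Number of column runs so that no run exceeds the loading limit."""
    return math.ceil(total_mg / max_load_mg)


if __name__ == "__main__":
    conc = protein_mg_per_ml(0.5, dilution=100.0)  # hypothetical reading
    total = conc * 6.0                             # hypothetical 6 ml sample
    print(conc, total, runs_needed(total))
```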

Figure 7 is a chromatogram depicting the separation of single-chain
protein from bivalent CC49/212 protein, using the cation-exchange method just
described. Peak 1, 27.32 minutes, represents the monomeric single-chain
fraction. Peak 2, 55.52 minutes, represents the bivalent protein fraction.

Figure 8 is a chromatogram of the purified monomeric single-chain
antigen-binding protein CC49/212 (Fraction 7 from Fig. 7) run on a Waters
Protein-Pak 300SW gel filtration column. Monomer, with minor contaminants
of dimer and trimer, is shown. Figure 9 is a chromatogram of the purified
bivalent antigen-binding protein CC49/212 (Fraction 15 from Fig. 7) run on
the same Waters Protein-Pak 300SW gel filtration column as used in Fig. 8.
TABLE 1

PolyCAT A Cation-Exchange HPLC Gradients

Time       Flow            Buffers** (%)
(min)*     (ml/min)        A        B        C

Initial    15.0            100      0        0
50.0       15.0            0        100      0
55.0       15.0            0        100      0
60.0       15.0            0        0        100
63.0       15.0            0        0        100
64.0       15.0            100      0        0
67.0       15.0            100      0        0

*Linear gradients are run between each time point.
**Buffer A, 60 mM MOPS, 0.5 mM Ca acetate, pH 6.0-6.4;
  Buffer B, 60 mM MOPS, 20 mM Ca acetate, pH 7.5-8.0;
  Buffer C, 40 mM MOPS, 100 mM CaCl2, pH 7.5.
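
Because linear gradients are run between the time points of Table 1, the
programmed buffer composition at any time can be obtained by linear
interpolation. The sketch below illustrates this. Treating "Initial" as
0 minutes is an assumption, as are the names used, and the result is the
programmed composition only: the composition actually reaching the column
outlet may lag behind it because of system delay volume.

```python
# (time in min, (%A, %B, %C)) taken from Table 1; "Initial" is assumed to be 0 min.
GRADIENT = [
    (0.0, (100.0, 0.0, 0.0)),
    (50.0, (0.0, 100.0, 0.0)),
    (55.0, (0.0, 100.0, 0.0)),
    (60.0, (0.0, 0.0, 100.0)),
    (63.0, (0.0, 0.0, 100.0)),
    (64.0, (100.0, 0.0, 0.0)),
    (67.0, (100.0, 0.0, 0.0)),
]


def composition(t_min):
    """Programmed (%A, %B, %C) at time t_min by linear interpolation."""
    if t_min <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, c0), (t1, c1) in zip(GRADIENT, GRADIENT[1:]):
        if t_min <= t1:
            frac = (t_min - t0) / (t1 - t0)
            return tuple(a + frac * (b - a) for a, b in zip(c0, c1))
    # Past the last time point the final composition is held.
    return GRADIENT[-1][1]


if __name__ == "__main__":
    for t in (0.0, 25.0, 52.0, 57.5, 66.0):
        print(t, composition(t))
```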



This purification procedure yielded multivalent antigen-binding proteins
that are more than 95% pure as examined by SDS-PAGE and size exclusion
HPLC. Modifications of the above procedure may be dictated by the
isoelectric point of the particular multivalent antigen-binding protein being
purified. Of the monomeric single-chain proteins that have been purified to
date, all have had an isoelectric point (pI) between 8.0 and 9.5. However, it
is possible that a multivalent antigen-binding protein may be produced with a
pI of less than 7.0. In that case, an anion exchange column may be required
for purification.

The CC49 monoclonal antibody was developed by Dr. Jeffrey Schlom's
group, Laboratory of Tumor Immunology and Biology, National Cancer
Institute. It binds specifically to the pan-carcinoma tumor antigen TAG-72.
See Muraro, R. et al., Cancer Research 48:4588-4596 (1988).

To determine the binding properties of the bivalent and monomeric
CC49/212 antigen-binding proteins, a competition radioimmunoassay (RIA)
was set up in which a radiolabeled CC49 IgG (with two antigen-binding sites)
was competed against unlabeled CC49 IgG, or monovalent (fraction
7 in Figure 7) or bivalent (fraction 15 in Figure 7) CC49/212 antigen-binding
protein for binding to the TAG-72 antigen on a human breast carcinoma
extract (see Figure 18). This competition RIA showed that the bivalent
antigen-binding protein competed equally well for the antigen as did IgG,
whereas the monovalent single-chain antigen-binding protein needed a ten-fold
higher protein concentration to displace the IgG. Thus, the monovalent
antigen-binding protein competes with about a ten-fold lower affinity for the
antigen than does the bivalent IgG or bivalent antigen-binding protein. Figure
18 also shows the result of the competition RIA of a non-TAG-72 specific
single-chain antigen-binding protein, the anti-fluorescein 4-4-20/212, which
does not compete for binding.

Example 2 

Process of Making Multivalent
Antigen-Binding Proteins Using Dissociating Agents

A. Process Using Guanidine HCl and Ethanol

Multivalent antigen-binding proteins were produced from purified
single-chain proteins in the following way. First the purified single-chain
protein at a concentration of 0.25-4 mg/ml was dialyzed against 0.5 moles/liter
(M) guanidine hydrochloride (GuHCl), 20% ethanol (EtOH), in 0.05 M TRIS,
0.05 M KCl, 0.01 M CaCl2 buffer pH 8.0. This combination of dissociating
agents is thought to disrupt the VL/VH interface, allowing the VH of a first
single-chain molecule to come into contact with a VL from a second
single-chain molecule. Other dissociating agents such as urea, and alcohols such as
isopropanol or methanol, should be substitutable for GuHCl and EtOH.
Following the initial dialysis, the protein was dialyzed against the load buffer
for the final HPLC purification step. Two separate purification protocols,
cation exchange and gel filtration chromatography, can be used to separate the
single-chain protein monomer from the multivalent antigen-binding proteins.
In the first method, monomeric and multivalent antigen-binding proteins were
separated by using cation exchange HPLC chromatography, using a
polyaspartate column (PolyCAT A). This was a similar procedure to that used
in the final purification of the antigen-binding proteins as described in
Example 1. The load buffer was 0.06 M MOPS, 0.001 M calcium acetate
pH 6.4. In the second method, the monomeric and multivalent
antigen-binding proteins were separated by gel filtration HPLC chromatography using
as a load buffer 0.04 M MOPS, 0.10 M calcium acetate pH 7.5. Gel
filtration chromatography separates proteins based on their molecular size.

Once the antigen-binding protein sample was loaded on the cation
exchange HPLC column, a linear gradient was run between the load buffer
(0.04 to 0.06 M MOPS, 0.000 to 0.001 M calcium acetate, 0 to 10% glycerol
pH 6.0-6.4) and a second buffer (0.04 to 0.06 M MOPS, 0.01 to 0.02 M
calcium acetate, 0 to 10% glycerol pH 7.5). It was important to have
extensively dialyzed the antigen-binding protein sample before loading it on the
column. Normally, the conductivity of the sample is monitored against the
dialysis buffer. Dialysis is continued until the conductivity drops below 600
µS. Figure 11 shows the separation of the monomeric (27.83 min.) and
bivalent (50.47 min.) forms of the CC49/212 antigen-binding protein by cation
exchange. The chromatographic conditions for this separation were as
follows: PolyCAT A column, 200 x 4.6 mm, operated at 0.62 ml/min.; load
buffer and second buffer as in Example 1; gradient program from 100 percent
load buffer A to 0 percent load buffer A over 48 min.; sample was CC49/212,
1.66 mg/ml; injection volume 0.2 ml. Fractions were collected from the two
peaks from a similar chromatogram and identified as monomeric and bivalent
proteins using gel filtration HPLC chromatography as described below.

Gel filtration HPLC chromatography (TSK G2000SW column from
Toyo Soda, Tokyo, Japan) was used to identify and separate monomeric
single-chain and multivalent antigen-binding proteins. This procedure has
been described by Fukano, et al., J. Chromatography 166:47 (1978).
Multimerization (creation of multivalent protein from monomeric single-chain
protein) was by treatment with 0.5 M GuHCl and 20% EtOH for the times
indicated in Table 2A followed by dialysis into the chromatography buffer.
Figure 12 shows the separation of monomeric (17.65 min.), bivalent (15.79
min.), trivalent (14.19 min.), and higher oligomers (shoulder at about 13.09
min.) of the B6.2/212 antigen-binding protein. The B6.2/212 single-chain
antigen-binding protein is described in Colcher, D., et al., J. Nat. Cancer
Inst. 82:1191-1197 (1990). This separation depicts the results of a 24-hour
multimerization treatment of a 1.0 mg/ml B6.2/212 antigen-binding protein
sample. The HPLC buffer used was 0.04 M MOPS, 0.10 M calcium acetate,
0.04% sodium azide, pH 7.5.

Figure 13 shows the results of a 24-hour treatment of a 4.0 mg/ml
CC49/212 antigen-binding protein sample, generating monomeric, bivalent and
trivalent proteins at 16.91, 14.9, and 13.42 min., respectively. The HPLC
buffer was 40 mM MOPS, 100 mM calcium acetate, pH 7.35.
Multimerization treatment was for the times indicated in Table 2A.

The results of Example 2A are shown in Table 2A. Table 2A shows 
the percentage of bivalent and other multivalent forms before and after 
treatment with 20% ethanol and 0.5M GuHCl. Unless otherwise indicated, 
percentages were determined using an automatic data integration software
package.
Table 2A

Summary of the generation of bivalent and higher
multivalent forms of B6.2/212 and CC49/212
proteins using guanidine hydrochloride and ethanol

            Time      Concentration                    %
protein     (hours)   (mg/ml)        monomer    dimer    trimer    multimers

CC49/212    0         0.25           85.7       11.6     1.7       0.0
            0         1.0(a)         84.0       10.6     5.5       0.0
            0         4.0            70.0       17.1     12.9(b)   0.0
            2         0.25           62.9       33.2     4.2       0.0
            2         1.0            24.2       70.6     5.1       0.0
            2         4.0            9.3        81.3     9.5       0.0
            26        0.25           16.0       77.6     6.4       0.0
            26        1.0            9.2        82.8     7.9       0.0
            26        4.0            3.7        78.2     18.1      0.0

B6.2/212    0         0.25           100.0      0.0      0.0       0.0
            0         1.0            100.0      0.0      0.0       0.0
            0         4.0            100.0      0.0      0.0       0.0
            2         0.25           98.1       1.9      0.0       0.0
            2         1.0            100.0      0.0      0.0       0.0
            2         4.0            [illeg.]   5.5      [illeg.]  0.0
            24        0.25           45.6       37.5     10.2      6.7
            24        1.0            50.8       21.4     12.3      15.0
            24        4.0            5.9        37.2     25.1      29.9

(a) Based on cut out peaks that were weighed.
(b) Average of two experiments.
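
The percentages in Table 2A can be checked against the definition given
earlier, under which a composition has "an excess of multivalent
antigen-binding protein over single-chain molecules" when it comprises more
than 50% multivalent protein. The sketch below applies that test to two rows
of the first block of the table (4.0 mg/ml at 0 hours and at 26 hours); the
function names are illustrative.

```python
def multivalent_percent(monomer, dimer, trimer, multimers):
    """Percentage of total protein present in multivalent (non-monomer) forms."""
    total = monomer + dimer + trimer + multimers
    return 100.0 * (dimer + trimer + multimers) / total


def has_multivalent_excess(monomer, dimer, trimer, multimers):
    """True if multivalent forms make up more than 50% of the composition."""
    return multivalent_percent(monomer, dimer, trimer, multimers) > 50.0


if __name__ == "__main__":
    before = (70.0, 17.1, 12.9, 0.0)  # 0 hours, 4.0 mg/ml
    after = (3.7, 78.2, 18.1, 0.0)    # 26 hours, 4.0 mg/ml
    for label, row in (("before", before), ("after", after)):
        print(label, round(multivalent_percent(*row), 1), has_multivalent_excess(*row))
```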



B. Process Using Urea and Ethanol

Multivalent antigen-binding proteins were produced from purified
single-chain proteins in the following way. First the purified single-chain
protein at a concentration of 0.25-1 mg/ml was dialyzed against 2 M urea, 20%
ethanol (EtOH), and 50 mM Tris buffer pH 8.0, for the times indicated in
Table 2B. This combination of dissociating agents is thought to disrupt the
VL/VH interface, allowing the VH of a first single-chain molecule to come into
contact with a VL from a second single-chain molecule. Other dissociating
agents such as isopropanol or methanol should be substitutable for EtOH.
Following the initial dialysis, the protein was dialyzed against the load buffer
for the final HPLC purification step.

Gel filtration HPLC chromatography (TSK G2000SW column from
Toyo Soda, Tokyo, Japan) was used to identify and separate monomeric
single-chain and multivalent antigen-binding proteins. This procedure has
been described by Fukano, et al., J. Chromatography 166:47 (1978).

The results of Example 2B are shown in Table 2B. Table 2B shows
the percentage of bivalent and other multivalent forms before and after
treatment with 20% ethanol and urea. Percentages were determined using an
automatic data integration software package.
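
The percentages reported for each form are area percentages of the
integrated chromatogram peaks. A minimal sketch of that calculation is
given here; the function name and the peak areas are hypothetical and are
not output of the integration package that was actually used.

```python
# Sketch: convert integrated HPLC peak areas into percentages of the total.
def area_percentages(peak_areas):
    """Return each peak's share of the total integrated area, in percent."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: round(100.0 * area / total, 1)
            for name, area in peak_areas.items()}


# Hypothetical integrated areas in arbitrary units, for illustration only.
example_areas = {"monomer": 500.0, "dimer": 300.0,
                 "trimer": 150.0, "multimers": 50.0}
percentages = area_percentages(example_areas)
print(percentages)
```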

Table 2B

Summary of the generation of bivalent and higher multivalent forms of
B6.2/212 and CC49/212 proteins using urea and ethanol

           Time      Concentration                      %
protein    (hours)   (mg/ml)          monomer    dimer    trimer    multimers

B6.2        0         0.25             44.1       37.6     15.9       2.4
            0         1.0              37.5       33.7     19.4       9.4
            3         0.25             22.2       66.5     11.3       0.0
            3         1.0              13.7       69.9     16.4       0.0



Example 3

Determination of Binding Constants

Three anti-fluorescein single-chain antigen-binding proteins have been
constructed based on the anti-fluorescein monoclonal antibody 4-4-20. The
three 4-4-20 single-chain antigen-binding proteins differ in the polypeptide
linker connecting the VH and VL regions of the protein. The three linkers used
were 202', 212 and 216 (see Table 3). Bivalent and higher forms of the 4-4-20
antigen-binding protein were produced by concentrating the purified
monomeric single-chain antigen-binding protein in the cation exchange load
buffer (0.06 M MOPS, 0.001 M calcium acetate pH 6.4) to 5 mg/ml. The
bivalent and monomeric forms of the 4-4-20 antigen-binding proteins were
separated by cation exchange HPLC (polyaspartate column) using a 50 min.
linear gradient between the load buffer (0.06 M MOPS, 0.001 M calcium
acetate pH 6.4) and a second buffer (0.06 M MOPS, 0.02 M calcium acetate
pH 7.5). Two 0.02 ml samples were separated, and fractions of the bivalent
and monomeric protein peaks were collected on each run. The amount of
protein contained in each fraction was determined from the absorbance at 278
nm from the first separation. Before collecting the fractions from the second
separation run, each fraction tube had a sufficient quantity of 1.03 x 10^- M
fluorescein added to it, such that after the fractions were collected a 1-to-1
molar ratio of protein-to-fluorescein existed. Addition of fluorescein stabilized
the bivalent form of the 4-4-20 antigen-binding proteins. These samples were

kept at I'C (on ice). 

The fluorescein dissociation rates were determined for each of these
samples following the procedures described by Herron, J.N., in Fluorescence
Hapten: An Immunological Probe, E.W. Voss, Ed., CRC Press, Boca Raton,
FL (1984). A sample was first diluted with 20 mM HEPES buffer pH 8.0 to
5.0 x 10^- M 4-4-20 antigen-binding protein. 560 µl of the 5.0 x 10^- M
4-4-20 antigen-binding protein sample was added to a cuvette in a fluorescence
spectrophotometer equilibrated at 2°C and the fluorescence was read. 140 µl
of 1.02 x 10^- M fluoresceinamine was added to the cuvette, and the
fluorescence was read every 1 minute for up to 25 minutes (see Table 4).
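
A dissociation rate can be extracted from such a time course by fitting a
first-order decay. The sketch below assumes a single-exponential
dissociation and assumes the fluorescence readings have already been
converted into the fraction of fluorescein still bound; the function name,
the rate used to generate the data and the data themselves are synthetic
and are not measurements from this example.

```python
import math


# Sketch: estimate a first-order dissociation rate (k_off, in s^-1) as the
# negative least-squares slope of ln(fraction bound) versus time.
def fit_first_order_rate(times_s, fraction_bound):
    ys = [math.log(f) for f in fraction_bound]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, ys))
    return -sxy / sxx


# Synthetic readings, one per minute for 25 minutes, generated with a
# made-up rate constant purely to exercise the fit.
made_up_rate = 4.0e-3
times = [60.0 * i for i in range(26)]
fractions = [math.exp(-made_up_rate * t) for t in times]
k_off = fit_first_order_rate(times, fractions)
print(k_off)
```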

The binding constants (Ka) for the 4-4-20 single-chain antigen-binding
protein monomers diluted in 20 mM HEPES buffer pH 8.0 in the absence of
fluorescein were also determined (see Table 4).

The three polypeptide linkers in these experiments differ in length.
The 202', 212 and 216 linkers are 12, 14 and 18 residues long, respectively.
These experiments show that there are two effects of linker length on the
4-4-20 antigen-binding proteins: first, the shorter the linker length the higher
the fraction of bivalent protein formed; second, the fluorescein dissociation
rates of the monomeric single-chain antigen-binding proteins are affected more
by the linker length than are the dissociation rates of the bivalent
antigen-binding proteins. With the shorter linkers 202' and 212, the bivalent
antigen-binding proteins have slower dissociation rates than the monomers.
Thus, the linkers providing optimum production and binding affinities for
monomeric and bivalent antigen-binding proteins may be different. Longer
linkers may be more suitable for monomeric single-chain antigen-binding
proteins, and shorter linkers may be more suitable for multivalent
antigen-binding proteins.



Table 3

Linker Designs

VL        Linker                    VH        Linker Name   Reference

-KLEIE    GKSSGSGSESKS (1)          TQKLD-    202'          Bird et al.
-KLEIK    GSTSGSGKSSEGKG (2)        EVKLD-    212           Bedzyk et al.
-KLEIK    GSTSGSGKSSEGSGSTKG (3)    EVKLD-    216           This application
-KLVLK    GSTSGKPSEGKG (4)          EVKLD-    217           This application

(1) SEQ ID NO. 1
(2) SEQ ID NO. 2
(3) SEQ ID NO. 3
(4) SEQ ID NO. 4
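
The linker lengths quoted in the text can be checked against the linker
sequences given as SEQ ID NOS. 1 to 4 in the sequence listing. The
following sketch does this; the dictionary and function names are
illustrative only, and the three-letter sequences are transcribed from the
sequence listing.

```python
# Sketch: convert the three-letter linker sequences of SEQ ID NOS. 1-4 into
# one-letter code and count residues.
THREE_TO_ONE = {"Gly": "G", "Ser": "S", "Thr": "T",
                "Lys": "K", "Glu": "E", "Pro": "P"}

LINKERS = {
    "202'": "Gly Lys Ser Ser Gly Ser Gly Ser Glu Ser Lys Ser",
    "212": "Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly",
    "216": ("Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly "
            "Ser Gly Ser Thr Lys Gly"),
    "217": "Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly",
}


def one_letter(three_letter_sequence):
    """Translate a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_sequence.split())


linker_lengths = {name: len(one_letter(seq)) for name, seq in LINKERS.items()}
print(linker_lengths)
```

The 202', 212 and 216 linkers come out at 12, 14 and 18 residues, in
agreement with the lengths stated above, and the 217 linker at 12 residues.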



Table 4

Effects of Linkers on the SCA Protein Monomers and Dimers

Linker                        202'           212            216

Monomer
  Fraction                    0.47           0.66           0.90
  Ka (M^-1)                   0.5 x 10^      1.0 x 10^      1.3 x 10^
  Dissociation rate (s^-1)    8.2 x 10^-     4.9 x 10^-     3.3 x 10^-

Dimer
  Fraction                    0.53                          0.10
  Dissociation rate (s^-1)    4.6 x 10^-     3.5 x 10^-     3.5 x 10^-

Monomer/Dimer
  Dissociation rate ratio     1.8            1.4            0.9
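
The monomer/dimer ratios in Table 4 follow from the tabulated dissociation
rates. The sketch below recomputes them from the leading figures of each
rate, on the assumption that the monomer and dimer rates for a given linker
carry the same power of ten, so that the exponent cancels in the ratio. The
variable names are illustrative.

```python
# Sketch: recompute the monomer/dimer dissociation rate ratios of Table 4.
# Only the leading figures of each rate are used; the power of ten is
# assumed to be the same for monomer and dimer and therefore cancels.
monomer_rate = {"202'": 8.2, "212": 4.9, "216": 3.3}
dimer_rate = {"202'": 4.6, "212": 3.5, "216": 3.5}

rate_ratio = {linker: round(monomer_rate[linker] / dimer_rate[linker], 1)
              for linker in monomer_rate}
print(rate_ratio)
```

A ratio above 1 means the monomer releases fluorescein faster than the
dimer, which is the case for the two shorter linkers.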



Example 4

Genetic Construction of a Mixed-Fragment Bivalent Antigen-Binding Protein

The genetic constructions for one particular heterobivalent antigen-
binding protein according to the Rearrangement model are shown in Figures
10A and 10B. Figure 10A is an amino acid and nucleotide sequence listing
of the 4-4-20 VL/212/CC49 VH construct, coding for a single-chain protein
with a 4-4-20 VL linked via a 212 polypeptide linker to a CC49 VH. Figure
10B is a similar listing showing the CC49 VL/212/4-4-20 VH construct, coding
for a single-chain protein with a CC49 VL linked via a 212 linker to a 4-4-20
VH. These single-chain proteins may recombine according to the
Rearrangement model to generate a heterobivalent protein comprising a CC49
antigen-binding site linked to a 4-4-20 antigen-binding site, as shown in Figure
5B.

"4-4-20 VL" means the variable region of the light chain of the 4-4-20
mouse monoclonal antibody (Bird, R.E. et al., Science 242:423 (1988)). The
number "212" refers to a specific 14-residue polypeptide linker that links the
4-4-20 VL and the CC49 VH. See Bedzyk, W.D. et al., J. Biol. Chem.
265:18615-18620 (1990). "CC49 VH" is the variable region of the heavy
chain of the CC49 antibody, which binds to the TAG-72 antigen. The CC49
antibody was developed at The National Institutes of Health by Schlom, et al.
Generation and Characterization of B72.3 Second Generation Monoclonal
Antibodies Reactive With The Tumor-associated Glycoprotein 72 Antigen,
Cancer Research 48:4588-4596 (1988).

Insertion of the sequences shown in FIGS. 10A and 10B, by standard
recombinant DNA methodology, into a suitable plasmid vector will enable one
of ordinary skill in the art to transform a suitable host for subsequent
expression of the single-chain proteins. See Maniatis et al., Molecular
Cloning, A Laboratory Manual, p. 104, Cold Spring Harbor Laboratory
(1982), for general recombinant techniques for accomplishing the aforesaid
goals; see also U.S. Patent 4,946,778 (Ladner et al.) for a complete
description of methods of producing single-chain protein molecules by
recombinant DNA technology.

To produce multivalent antigen-binding proteins from the two single-
chain proteins, 4-4-20 VL/212/CC49 VH and CC49 VL/212/4-4-20 VH, the two
single-chain proteins are dialyzed into 0.5 M GuHCl/20% EtOH, being
combined in a single solution either before or after dialysis. The multivalent
proteins are then produced and separated as described in Example 2.

Example 5 

Preparation of Multivalent 
Antigen-Binding Proteins by Chemical Cross-Linking 

Free cysteines were engineered into the C-terminal of the 4-4-20/212
single-chain antigen-binding protein, in order to chemically crosslink the
protein. The design was based on the hinge region found in antibodies
between the CH1 and CH2 regions. In order to try to reduce antigenicity in
humans, the hinge sequence of the most common IgG class, IgG1, was
chosen. The 4-4-20 Fab structure was examined and it was determined that
the C-terminal sequence GluH216-ProH217-ArgH218 was part of the CH1
region and that the hinge between CH1 and CH2 starts with ArgH218 or
GlyH219 in the mouse 4-4-20 IgG2A antibody. Figure 14 shows the structure
of a human IgG. The hinge region is indicated generally. Thus the hinge
from human IgG1 would start with LysH218 or SerH219 (see Table 5).

The C-terminal residue in most of the single-chain antigen-binding
proteins described to date is the amino acid serine. In the design for the hinge
region, the C-terminal serine in the 4-4-20/212 single-chain antigen-binding
protein was made the first serine of the hinge and the second residue of the
hinge was changed from a cysteine to a serine. This hinge cysteine normally
forms a disulfide bridge to the C-terminal cysteine in the light chain.

TABLE 5

                                 218
IgG2A mouse (1)               E P R G P T I K P C P P C L C -
IgG1 human (2)              A E P K S C D K T H T C P P C -
SCA* (3)                  - - V T V S
SCA* Hinge design 1 (4)   - - V T V S S D K T H T C
SCA* Hinge design 2 (5)   - - V T V S S D K T H T C P P C

* single-chain antigen-binding protein

(1) SEQ ID NO. 5
(2) SEQ ID NO. 6
(3) SEQ ID NO. 7
(4) SEQ ID NO. 8
(5) SEQ ID NO. 9

There are possible advantages to having two C-terminal cysteines, for
they might form an intramolecular disulfide bond, making the protein recovery
easier by protecting the sulfurs from oxidation. The hinge regions were added
by introduction of a BstE II restriction site in the 3'-terminus of the gene
encoding the 4-4-20/212 single-chain antigen-binding protein (see Figures
15A-15B).

The monomeric single-chain antigen-binding protein containing the C-
terminal cysteine can be purified using the normal methods of purifying
single-chain antigen-binding proteins, with minor modifications to protect the
free sulfhydryls. The cross-linking could be accomplished in one of two
ways. First, the purified single-chain antigen-binding protein could be treated
with a mild reducing agent, such as dithiothreitol, then allowed to air oxidize
to form a disulfide bond between the individual single-chain antigen-binding
proteins. This type of chemistry has been successful in producing
heterodimers from whole antibodies (Nisonoff et al., Quantitative Estimation
of the Hybridization of Rabbit Antibodies, Nature 4826:355-359 (1962);
Brennan et al., Preparation of Bispecific Antibodies by Chemical
Recombination of Monoclonal Immunoglobulin G1 Fragments, Science
229:81-83 (1985)). Second, chemical crosslinking agents such as
bis-maleimidehexane could be used to cross-link two single-chain
antigen-binding proteins by their C-terminal cysteines. See Partis et al.,
J. Prot. Chem. 2:263-277 (1983).



WO 93/11161

PCT/US92/09965

Example 6 

Genetic Construction of Bivalent Antigen-Binding Proteins 

Bivalent antigen-binding proteins can be constructed genetically and
subsequently expressed in E. coli or other known expression systems. This
can be accomplished by genetically removing the stop codons at the end of a
gene encoding a monomeric single-chain antigen-binding protein and inserting
a linker and a gene encoding a second single-chain antigen-binding protein.
We have constructed a gene for a bivalent CC49/212 antigen-binding protein
in this manner (see Figure 16). The CC49/212 gene in the starting expression
plasmid is in an Aat II to Bam HI restriction fragment (see Bird et al., Single-
Chain Antigen-Binding Proteins, Science 242:423-426 (1988); and Whitlow
et al., Single-Chain Fv Proteins and Their Fusion Proteins, Methods 2:77-105
(1991)). The two stop codons and the Bam HI site at the C-terminal end of
the CC49/212 antigen-binding protein gene were replaced by a single residue
linker (Ser) and an Aat II restriction site. The resulting plasmid was cut with
Aat II and the purified Aat II to Aat II restriction fragment was ligated into
Aat II cut CC49/212 single-chain antigen-binding protein expression plasmid.
The resulting bivalent CC49/212 single-chain antigen-binding protein
expression plasmid was transfected into an E. coli expression host that
contained the gene for the cI857 temperature-sensitive repressor. Expression
of single-chain antigen-binding protein in this system is induced by raising the
temperature from 30°C to 42°C. Fig. 17 shows the expression of the divalent
CC49/212 single-chain antigen-binding protein of Fig. 16 at 42°C, on an SDS-
PAGE gel containing total E. coli protein. Lane 1 contains the molecular
weight standards. Lane 2 is the uninduced E. coli production strain grown at
30°C. Lane 3 is divalent CC49/212 single-chain antigen-binding protein
induced by growth at 42°C. The arrow shows the band of expressed divalent
CC49/212 single-chain antigen-binding protein.

Example 7

Construction, Purification, and Testing of 4-4-20/CC49
Heterodimer Fv With 217 Linkers

The goals of this experiment were to produce, purify and analyze for
activity a new heterodimer Fv that would bind to both fluorescein and the
pan-carcinoma antigen TAG-72. The design consisted of two polypeptide
chains, which associated to form the active heterodimer Fv. Each polypeptide
chain can be described as a mixed single-chain Fv (mixed sFv). The first mixed
sFv (GX 8952) comprised a 4-4-20 variable light chain (VL) and a CC-49
variable heavy chain (VH) connected by a 217 polypeptide linker (Figure 19A).
The second mixed sFv (GX 8953) comprised a CC-49 VL and a 4-4-20 VH
connected by a 217 polypeptide linker (Figure 19B). The sequence of the 217
polypeptide linker is shown in Table 3. Construction of analogous
CC49/4-4-20 heterodimers connected by a 212 polypeptide linker is described
in Example 4.

Results 

A. Purification 

One 10-liter fermentation of each mixed sFv was grown on casein
digest-glucose-salts medium at 32°C to an optical density at 600 nm of 15 to
20. The mixed sFv expression was induced by raising the temperature of the
fermentation to 42°C for one hour. 277 gm (wet cell weight) of E. coli strain
GX 8952 and 233 gm (wet cell weight) of E. coli strain GX 8953 were
harvested in a centrifuge at 7000 g for 10 minutes. The cell pellets were kept
and the supernate discarded. The cell pellets were frozen at -20°C for
storage.

2.55 liters of "lysis/wash buffer" (50 mM Tris/200 mM NaCl/1 mM
EDTA, pH 8.0) was added to both of the mixed sFv's cell pellets, which were
previously thawed and combined to give 510 gm of total wet cell weight. After
complete suspension of the cells they were then passed through a Gaulin
homogenizer at 9000 psi and 4°C. After this first pass the temperature
increased to 23°C. The temperature was immediately brought down to 0°C
using dry ice and methanol. The cell suspension was passed through the
Gaulin homogenizer a second time and centrifuged at 8000 rpm with a Dupont
GS-3 rotor for 60 minutes. The supernatant was discarded after centrifugation
and the pellets resuspended in 2.5 liters of "lysis/wash buffer" at 4°C. This
suspension was centrifuged for 45 minutes at 8000 rpm with the Dupont GS-3
rotor. The supernatant was again discarded and the pellet weighed. The
pellet weight was 136.1 gm.

1300 ml of 6 M Guanidine Hydrochloride/50 mM Tris/50 mM KCl/10 mM
CaCl2 pH 8.0 at 4°C was added to the washed pellet. An overhead mixer was
used to speed solubilization. After one hour of mixing, the heterodimer
GuHCl extract was centrifuged for 45 minutes at 8000 rpm and the pellet was
discarded. The 1425 ml of heterodimer Fv 6 M GuHCl extract was slowly
added (16 ml/min) to 14.1 liters of "Refold Buffer" (50 mM Tris/50 mM
KCl/10 mM CaCl2, pH 8.0) under constant mixing at 4°C to give an
approximate dilution of 1:10. Refolding took place overnight at 4°C.
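
The refolding dilution can be worked through numerically. The sketch below
uses the volumes, concentration and addition rate given above and assumes
the volumes are simply additive; the function name is illustrative.

```python
# Sketch: dilution of the 6 M GuHCl extract into Refold Buffer.
# Volumes are assumed to be additive.
def dilute(extract_ml, buffer_ml, extract_conc_m):
    """Return final volume (ml), fold dilution and final concentration (M)."""
    final_ml = extract_ml + buffer_ml
    fold = final_ml / extract_ml
    final_conc_m = extract_conc_m * extract_ml / final_ml
    return final_ml, fold, final_conc_m


final_ml, fold, guhcl_m = dilute(1425.0, 14100.0, 6.0)
addition_minutes = 1425.0 / 16.0  # extract added at 16 ml/min

print(final_ml, round(fold, 1), round(guhcl_m, 2), round(addition_minutes))
```

Under these assumptions the final volume is about 15.5 liters, the
dilution is close to 11-fold, the residual GuHCl concentration is about
0.55 M, and the addition takes roughly 89 minutes.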

After 17 hours of refolding the anti-fluorescein activity was checked by
a 40% quenching assay, and the amount of active protein calculated. 150 mg
total active heterodimer Fv was found by the 40% quench assay, assuming a
54,000 molecular weight.

4 liters of prechilled (4°C) 190 proof ethanol was added to the 15 liters
of refolded heterodimer with mixing for 3 hours. The mixture sat overnight
at 4°C. A flocculent precipitate had settled to the bottom after this overnight
treatment. The nearly clear solution was filtered through a Millipak-200
(0.22 µ) filter so as to not disturb the precipitate. A 40% quench assay
showed that 10% of the anti-fluorescein activity was recovered in the filtrate.

The filtered sample of heterodimer was dialyzed, using a Pellicon
system containing 10,000 dalton MWCO membranes, with "dialysis buffer"
40 mM MOPS/0.5 mM Calcium Acetate (CaAc), pH 6.4 at 4°C. 20 liters of
dialysis buffer was required before the conductivity of the retentate was equal
to that of the dialysis buffer (~500 µS). After dialysis the heterodimer sample
was filtered through a Millipak-20 filter, 0.22 µ. After this step a 40% quench
assay showed there was 8.8 mg of active protein.
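
The active-protein figures reported so far can be put on a molar basis and
compared. This sketch uses only the 150 mg and 8.8 mg values and the
assumed molecular weight of 54,000 stated above; the function and variable
names are illustrative.

```python
# Sketch: express the quench-assay results in micromoles and as an overall
# recovery, using the 54,000 molecular weight assumed in the text.
HETERODIMER_MW = 54000.0  # g/mol


def micromoles(mass_mg, molecular_weight=HETERODIMER_MW):
    """Convert a mass in mg to micromoles (mg / (g/mol) = mmol)."""
    return mass_mg / molecular_weight * 1000.0


after_refolding_mg = 150.0
after_dialysis_mg = 8.8

overall_recovery_percent = 100.0 * after_dialysis_mg / after_refolding_mg
print(round(micromoles(after_refolding_mg), 2),
      round(micromoles(after_dialysis_mg), 3),
      round(overall_recovery_percent, 1))
```

On these numbers about 2.8 micromoles of active heterodimer were present
after refolding, and just under 6% of that activity remained after the
ethanol treatment, dialysis and filtration.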

The crude heterodimer sample was loaded on a PolyCAT A cation
exchange column at 20 ml/min. The column was previously equilibrated with
60 mM MOPS, 1 mM CaAc pH 6.4, at 4°C (Buffer A). After loading, the
column was washed with 150 ml of "Buffer A" at 15 ml/min. A 50 min linear
gradient was performed at 15 ml/min using "Buffer A" and "Buffer B" (60 mM
MOPS, 20 mM CaAc pH 7.5 at 4°C). The gradient conditions are presented
in Table 6. "Buffer C" comprises 60 mM MOPS, 100 mM CaCl2, pH 7.5.



Table 6

Time        %A        %B        %C       Flow

 0:00      100.0       0.0       0.0     15 ml/min
50:00        0.0     100.0       0.0     15 ml/min
52:00        0.0     100.0       0.0     15 ml/min
54:00        0.0       0.0     100.0     15 ml/min
58:00        0.0       0.0     100.0     15 ml/min
60:00      100.0       0.0       0.0     15 ml/min
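
The gradient program can be read as a set of time points joined by linear
ramps. The sketch below interpolates the buffer composition at any time in
the run. It assumes the three percentage columns are Buffers A, B and C in
that order and that every segment is linear; only the 50 minute A-to-B
segment is explicitly described as linear in the text. The names are
illustrative.

```python
# Sketch: buffer composition during the gradient, by linear interpolation
# between the programmed time points (minutes, %A, %B, %C).
GRADIENT = [
    (0.0, 100.0, 0.0, 0.0),
    (50.0, 0.0, 100.0, 0.0),
    (52.0, 0.0, 100.0, 0.0),
    (54.0, 0.0, 0.0, 100.0),
    (58.0, 0.0, 0.0, 100.0),
    (60.0, 100.0, 0.0, 0.0),
]


def composition_at(minute):
    """Return (%A, %B, %C) at the given time, assuming linear segments."""
    if not GRADIENT[0][0] <= minute <= GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, *c0), (t1, *c1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= minute <= t1:
            frac = (minute - t0) / (t1 - t0)
            return tuple(a + frac * (b - a) for a, b in zip(c0, c1))


print(composition_at(25.0))
print(composition_at(53.0))
```

Halfway through the main gradient the eluent is an equal mixture of
Buffers A and B; at 53 minutes it is an equal mixture of Buffers B and C.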



Approximately JOml fractions were collected and analyzed for activity,
purity, and molecular weight by size-exclusion chromatography. The fractions
were not collected by peak, so contamination between peaks is likely.
Fractions 3 through 7 were pooled (total volume - 218 ml), concentrated to
50 ml and dialyzed against 4 liters of 60 mM MOPS, 0.5 mM CaAc pH 6.4 at
4°C overnight. The dialyzed pool was filtered through a 0.22 µ filter and
checked for absorbance at 280 nm. The filtrate was loaded onto the PolyCAT
A column, equilibrated with 60 mM MOPS, 1 mM CaAc pH 6.4 at 4°C, at a
flow rate of 10 ml/min. Buffer B was changed to 60 mM MOPS, 10 mM CaAc
pH 7.5 at 4°C. The gradient was run as in Table 6. The fractions were
collected by peak and analyzed for activity, purity, and molecular weight.
The chromatogram is shown in Figure 20. Fraction identification and analysis
is presented in Table 7.



Table 7

Fraction Analysis of the Heterodimer Fv protein

Fraction    A280        Total Volume    HPLC-SE Elution Time
No.         reading     (ml)            (min)

2           0.161       36              20.525
3           0.067       40
4           0.033       40
5           0.17S       45              19.133
6           0.234       50              19.163
7           0.069       50
8           0.055       ^:o


Fractions 2 to 7 and the starting material were analyzed by SDS gel
electrophoresis, 4-20%. A picture and description of the gel is presented in
Figure 21.

B. HPLC Size Exclusion Results

Fractions 2, 5, and 6 correspond to the three main peaks in Figure 20
and therefore were chosen to be analyzed by HPLC size exclusion. Fraction
2 corresponds to the peak that runs at 21.775 minutes in the preparative
purification (Figure 20), and runs on the HPLC sizing column at 20.525
minutes, which is in the monomeric position (Figure 22A). Fractions 5 and
6 (30.1 and 33.455 minutes, respectively, in Figure 20) run on the HPLC
sizing column (Figures 22B and 22C) at 19.133 and 19.163 minutes,
respectively (see Table 7). Therefore, both of these peaks could be considered
dimers. 40% Quenching assays were performed on all fractions of this
purification. Only fraction 5 gave significant activity. 2.4 mg of active
CC49/4-4-20 heterodimer Fv was recovered in fraction 5, based on the
Scatchard analysis described below.

C. N-terminal sequencing of the fractions

The active heterodimer Fv fraction should contain both polypeptide
chains. N-terminal sequence analysis showed that fractions 5 and 6 displayed
N-terminal sequences consistent with the presence of both CC49 and 4-4-20
polypeptides and fraction 2 displayed a single sequence corresponding to the
CC49/212/4-4-20 polypeptide only. We believe that fraction 6 was
contaminated by fraction 5 (see Figure 20), since only fraction 5 had
significant activity.

D. Anti-fluorescein activity by Scatchard analysis

The fluorescein association constants (Ka) were determined for
fractions 5 and 6 using the fluorescence quenching assay described by Herron,
J.N., in Fluorescence Hapten: An Immunological Probe, E.W. Voss, ed.,
CRC Press, Boca Raton, FL (1984). Each sample was diluted to
approximately 5.0 x 10^- M with 20 mM HEPES buffer pH 8.0. 590 µl of the
5.0 x 10^- M sample was added to a cuvette in a fluorescence
spectrophotometer equilibrated at room temperature. In a second cuvette 590
µl of 20 mM HEPES buffer pH 8.0 was added. To each cuvette was added
10 µl of 3.0 x 10^- M fluorescein in 20 mM HEPES buffer pH 8.0, and the
fluorescence recorded. This is repeated until 140 µl of fluorescein had been
added. The resulting Scatchard analysis for fraction 5 shows a binding
constant of 1.16 x 10^ M^-1 (see Figure 23). This is very close
to the 4-4-20/212 sFv constant of 1.1 x 10^ M^-1 (see Pantoliano et al.,
Biochemistry 30:10117-10125 (1991)). The R intercept on the Scatchard
analysis represents the fraction of active material. For fraction 5, 61% of the
material was active. The graph of the Scatchard analysis on fraction 6 shows
a binding constant of 3.3 x 10^ M^-1 and 14% active. The activity that is
present in fraction 6 is most likely contaminants from fraction 5.
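
A Scatchard analysis of this kind plots r/[free] against r, where r is the
amount of fluorescein bound per protein molecule. The slope of the fitted
line is -Ka and the intercept on the r axis is the fraction of active
material. The sketch below illustrates the fit on synthetic data generated
from a made-up association constant and active fraction; none of the
numbers are measurements from this example, and the function name is
illustrative.

```python
# Sketch: Scatchard fit by ordinary least squares.
# Model: r / free = Ka * (n - r), so slope = -Ka and the r-axis intercept
# is n, the fraction of active binding sites.
def scatchard_fit(bound_per_protein, free_molar):
    xs = list(bound_per_protein)
    ys = [r / f for r, f in zip(xs, free_molar)]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope
    active_fraction = intercept / ka
    return ka, active_fraction


# Synthetic titration generated from made-up parameters.
made_up_ka = 2.0e8      # M^-1
made_up_active = 0.6
free = [1.0e-9, 2.0e-9, 5.0e-9, 1.0e-8, 2.0e-8]
bound = [made_up_active * made_up_ka * f / (1.0 + made_up_ka * f)
         for f in free]

ka, active_fraction = scatchard_fit(bound, free)
print(ka, active_fraction)
```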

E. Anti-TAG-72 activity by competition ELISA

The CC49 monoclonal antibody was developed by Dr. Jeffrey Schlom's
group, Laboratory of Tumor Immunology and Biology, National Cancer
Institute. It binds specifically to the pan-carcinoma tumor antigen TAG-72.
See Muraro, R., et al., Cancer Research 48:4588-4596 (1988).

To determine the binding properties of the bivalent CC49/4-4-20 Fv
(fraction 5) and the CC49/212 sFv, a competition enzyme-linked
immunosorbent assay (ELISA) was set up in which a CC49 IgG labeled with
biotin was competed against unlabeled CC49/4-4-20 Fv and the CC49/212 sFv
for binding to TAG-72 on a human breast carcinoma extract (see Figure 24).
The amount of biotin-labeled CC49 IgG was determined using a preformed
complex with avidin and biotin coupled to horseradish peroxidase and O-
phenylenediamine dihydrochloride (OPD). The reaction was stopped with 4N
H2SO4 (sulfuric acid) after 10 min. and the optical density read at 490 nm.
This competition ELISA showed that the bivalent CC49/4-4-20 Fv binds to the
TAG-72 antigen. The CC49/4-4-20 Fv needed a two hundred-fold higher
protein concentration to displace the IgG than the single-chain Fv.



Example 8

Cross-Linking Antigen-Binding Dimers

We have chemically crosslinked dimers of 4-4-20/212 antigen-binding
protein with the two cysteine C-terminal extension (4-4-20/212 CPPC single-
chain antigen-binding protein) in two ways. In Example 5 we describe the
design and genetic construction of the 4-4-20/212 CPPC single-chain antigen-
binding protein (hinge design 2 in Table 5). Figure 15B shows the nucleic
acid and protein sequences of this protein. After purifying the 4-4-20/212
CPPC single-chain antigen-binding protein, using the methods described in
Whitlow and Filpula, Meth. Enzymol. 2:91 (1991), dimers were formed by
two methods. First, the free cysteines were mildly reduced with dithiothreitol
(DTT) and then the disulfide bonds between the two molecules were allowed
to form by air oxidation. Second, the chemical crosslinker bis-
maleimidehexane was used to produce dimers by crosslinking the free
cysteines from two 4-4-20/212 CPPC single-chain antigen-binding proteins.

A 0.1 mg/ml solution of the 4-4-20/212 CPPC single-chain antigen-
binding protein was mildly reduced using 1 mM DTT, 50 mM HEPES, 50 mM
NaCl, 1 mM EDTA buffer pH 8.0 at 4°C. The samples were dialyzed against
50 mM HEPES, 50 mM NaCl, 1 mM EDTA buffer pH 8.0 at 4°C overnight,
to allow the oxidation of free sulfhydryls to intermolecular disulfide bonds.
Figure 25 shows a non-reducing SDS-PAGE gel after the air oxidation; it
shows that approximately 10% of the 4-4-20/212 CPPC protein formed dimers
with molecular weights around 55,000 Daltons.

A 0.1 mg/ml solution of the 4-4-20/212 CPPC single-chain antigen-
binding protein was treated with 2 mM bis-maleimidehexane. Unlike forming
a disulfide bond between two free cysteines in the previous example, the bis-
maleimidehexane crosslinker material should be stable to reducing agents such
as β-mercaptoethanol. Figure 26 shows that approximately 5% of the treated
material produced dimer with a molecular weight of 55,000 Daltons on a
reducing SDS-PAGE gel (samples were treated with β-mercaptoethanol prior
to being loaded on the gel). We further purified the bis-maleimidehexane
treated 4-4-20/212 CPPC protein on PolyCAT A cation exchange column after
the protein had been extensively dialyzed against buffer A. Figure 26 shows
that we were able to enhance the fraction containing the dimer to
approximately 15%.

Conclusions

We have produced a heterodimer Fv from two complementary mixed
sFv's which has been shown to have the size of a dimer of the sFv's. The N-
terminal analysis has shown that the active heterodimer Fv contains two
polypeptide chains. The heterodimer Fv has been shown to be active for both
fluorescein and TAG-72 binding.

All publications cited herein are incorporated fully into this disclosure
by reference.

From the foregoing it will be appreciated that, although specific
embodiments of the invention have been described herein for purposes of
illustration, various modifications may be made without deviating from the
spirit and scope of the invention and the following claims. As examples, the
steps of the preferred embodiment constitute only one form of carrying out the
process in which the invention may be embodied.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

     (i) APPLICANT: Whitlow, Marc
                    Wood, James F.
                    Hardman, Karl
                    Bird, Robert
                    Filpula, David
                    Rollence, Michele

    (ii) TITLE OF INVENTION: Multivalent Antigen-Binding Proteins

   (iii) NUMBER OF SEQUENCES: 23

    (iv) CORRESPONDENCE ADDRESS:
          (A) ADDRESSEE: Sterne, Kessler, Goldstein & Fox
          (B) STREET: 1225 Connecticut Avenue
          (C) CITY: Washington
          (D) STATE: D.C.
          (E) COUNTRY: U.S.A.
          (F) ZIP: 20036

     (v) COMPUTER READABLE FORM:
          (A) MEDIUM TYPE: Floppy disk
          (B) COMPUTER: IBM PC compatible
          (C) OPERATING SYSTEM: PC-DOS/MS-DOS
          (D) SOFTWARE: PatentIn Release #1.0, Version #1.25

    (vi) CURRENT APPLICATION DATA:
          (A) APPLICATION NUMBER: (to be assigned)
          (B) FILING DATE: Herewith
          (C) CLASSIFICATION:

   (vii) PRIOR APPLICATION DATA:
          (A) APPLICATION NUMBER: US 07/796,93
          (B) FILING DATE: 25-NOV-1991

  (viii) ATTORNEY/AGENT INFORMATION:
          (A) NAME: Goldstein, Jorge

    (ix) TELECOMMUNICATION INFORMATION:
          (A) TELEPHONE: (202) 633-7533
          (B) TELEFAX: (202) 833-8716

(2) INFORMATION FOR SEQ ID NO:1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 12 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: both

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

     Gly Lys Ser Ser Gly Ser Gly Ser Glu Ser Lys Ser

(2) INFORMATION FOR SEQ ID NO:2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 14 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: both

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

     Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly
     1               5                   10

(2) INFORMATION FOR SEQ ID NO:3:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 18 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: both

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

     Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Ser Gly Ser Thr
     1               5                   10                  15

     Lys Gly

(2) INFORMATION FOR SEQ ID NO:4:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 12 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: both

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

     Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly
     1               5                   10

(2) INFORMATION FOR SEQ ID NO:5:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 15 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: both

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

     Glu Pro Arg Gly Pro Thr Ile Lys Pro Cys Pro Pro Cys Leu Cys
     1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:6:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 15 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: both

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

     Ala Glu Pro Lys Ser Cys Asp Lys Thr His Thr Cys Pro Pro Cys
     1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:7:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 4 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: both

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

     Val Thr Val Ser
     1

(2) INFORMATION FOR SEQ ID NO:8:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 11 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: both

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

     Val Thr Val Ser Ser Asp Lys Thr His Thr Cys
     1               5                   10

(2) INFORMATION FOR SEQ ID NO:9:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 14 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: both

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

     Val Thr Val Ser Ser Asp Lys Thr His Thr Cys Pro Pro Cys
     1               5                   10

(2) INFORMATION FOR SEQ ID NO:10:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 731 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: both
          (D) TOPOLOGY: both

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 1..729



8^ 



46 



S6 



(xi) SEQUENCE -DESC^-^IPTI ON: SEQ ID NO: 10 r 

CaC GXC GTT ATG Acr AC?. CCA CTA TCA CTT CCT GTT AGT CTA GGT 

Asp Val Val Met Thr G3 n Thr Pro Leu Ser Leu Pro Val Ser Leu Gly 
1 . ' S 10 

GAT CAA GCC TCC T^C TCr TGC Tr-A TCT AGT GAG AGC CTT GTA CAC AGT 
^ ^ ^ ser li. Sc . C/n A.g Ser Ser Gin Ser Leu Val Hie Ser 
20 25 -30 

MT GGA AAC ACC T.T TTA CGI TGG TAC CTG CAG AAG GGC TCX 144 

Asn Gly Asn Thr T^-: Leu Ar^ Trp Tyr Leu Gin Lys Pro Gly Gin Ser 
35 ' 40 45 

CCA AAG GTC CTG ATC TAC CTT TCC AAC CGA TTT TCT GGG GTC 

So Lyn Val Leu He Tyr Lyn Val Ser Aea Arg Phe Ser Gly Val Pro 
SO 55 ^0 

GAC AGG rrC AGT GOG AGT GCA TCA GC-G ACA CAT TTC ACA CTC AAG ATC 
Sp Arg IL Ser Glv Sc. Gly Ser Gly Thr Aap Phe Thr Leu Lys lie 
7J 75 80 

AGC AGA GTG G.AG QCi CA ^ GAT C G GGA GTT TAT TTC TGC TCT CAA AGT 288 
S ^ Glu A-' Gi. A -1. I..U Gly val T^.-r Phe Cya Ser Gin Ser 



192 



240 



336 



ACA CAT GTT CCG TCG ACG Trc ■ TT GGA GGC ACC AAG CTT GAA ATC AAA 
^ Ss Val Pro Thr Ph. Gly Gly Gly Th^ Lya Leu Glu He Lya 

100 105 110 

ecr TCT ACC TCT Gf- TCT GGT ■ ^A TCC TCT GAvA GGC AAA GGT CAG GTT 384 
Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Gln Val
115 120 125

CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT GGG GCT TCA GTG 432

Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser Val

130 135 140

AAQ ATT TCC TGC AAG GCT TCr GGC TAG ACC TTC ACT GAC CAT GCA ATT 480 

Lvs lie Ser Cyc Lyo Al.- Sor Gly T>^r Thr Phe Thr Aep His Ala lie 

145 150 • Its 160 

cue TGG GTG AAA CAG T-J.C CCT C.\A CAG GGC CTG GAA TGG ATT GGA TAT 528 

His Trp Val Lys Gin Ann Pro -lu Gin Gly Leu Glu Trp He Gly Tyr 

16S 170 175 

TTT TCT CCC GGA AAT GAT GAT TTT AA/v TAC TJ^T GAG AGG TTC AAG GGC 576 

Phe Ser Pro Gly Asn Asp Ai;p Phe Lyc T>'r Aun Glu Arg Phe Lye Gly 

180 las 190 

AAG GCC ACA CTG ACT C'.TA HAC .\A.A TCC TCC ; 'IC ACT CCC TAC GTG CAG 624 

Lye Ala Thr Leu Thr A .a Ar? I.yo :.er Ser S-r Thr Ala Tyr Val Gin 

195 200 205 

CTC AAC AGC CTG AC7-. T'TT n.' G G/.T TCT GCA GTG TAT TTC TGT ACA AGA 672 

Leu Aen Ser Leu Thr ::--:-r Glu Acp S.-r Ala Val Tyr The Cys Thr Arg 

210 , 215 220 

TCC CTG AAT ATG GCC TAC TGG GGT C.\A GGA ACC TCA GTG ACC GTG TCC 720 

Ser Leu Aen Met Ala Tr? "Vy Gin Gly Ihr Ser Val Thr Val Ser 

225 ' 230 235 240 



TRA TAG GAT CC 
* * Aop 



731 



(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 243 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
1 5 10 15

Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser
20 25 30

Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser
35 40 45

Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro
50 55 60

Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
65 70 75 80

Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser
85 90 95

Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
100 105 110

Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Gln Val
115 120 125

Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser Val
130 135 140

Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala Ile
145 150 155 160

His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr
165 170 175

Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly
180 185 190

Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln
195 200 205

Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr Arg
210 215 220

Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser
225 230 235 240

* * Asp

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 744 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..744

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC 48
Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1 5 10 15

SAG AAG GTT ACT TT" -MC TGC AAG TCC f.GT CAG AGC CTT TTA TAT AGT 96 
vli t5 Leu Ser Cye Lys Ser Ser Gin Ser Leu Leu Tyr Ser 
20 25 30 

GGT AAT CAA AAG AAC 7\C TTG GCC TGG TAC CAG CAG AAA CCA GGG CAG 144 
^ ^ AS.-. : -T Lcu Ala Trp Tyr Gin Gin Lys Pro Gly Gin 

35 40 

TCT CCT AAA CTG CT. ATT TAC TGG GCA TCC CCT AGG GAA TCT GGG GTC 192 
Ser Pro Lye Leu Leu I -o T>T Trp Aia ^er /.la ^rg Glu Ser Gly vai. 
SO 55 60 

CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC 240 
^ Arg -Phe Thr G.y Ser Gly Se:: Giy Thr Asp Phe Thr Leu Ser 
65 >3 •'b 

ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG 288 
lie ser Ser Val Lys Thr Glu Asp Le. Ala Val Tyr Tyr Cys Gin Gin 

8S 90 

TAT TAT AGC TAT CCC CTC ACG TTC GGT GCT GGG ACC AAQ CTT GTG CTG 336 
?^ ser T^ Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu 
^ ' 100 los 110 

ivaft err TCT ACT TC" GC^ AGC GGC AA.\ TCT TCT GAA GGT AAA GGT GAA 384 
^ Gly I" tS Se^ Giy ser Gly Lys Ser Ser Glu Gly Lys Gly GlU 
' ' lis 120 12= 

GTT AAA CTG GAT GAG ACT GGA GGA GGC TTG GTG CAA CCT GGG AGG CCC 432 

^ Su Asp Glu Thr Gly Gly Gly Leu Val Gin Pro Gly Arg Pro 

130 135 -^-O 

r-rr TCr tg- CTT GCC TCT GGA TTC ACT' TTT AGT GAC TAC TGG 480 

^It ^ Su ser cj. v.JI Ala Ser Gly .he Thr Phe Ser Asp Tyr Trp 
145 '-■-'^ 



528 



^7.5 

CAA ATT AGA AAC AAA CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT 576
Gln Ile Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser
180 185 190

GTG AAA GGC AGA TTC ACC ATC TCA AGA GAT GAT TCC AAA AGT AGT GTC 624
Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val
195 200 205

TAC CTG CAA ATG A7-.C AAC TTA AGA GTT ^ GAC ATG GGT ATC TAT TAC 672 
Tyr Leu Gin Kec AL^n Asn Leu Arg Val Gh.: Asp Mec Gly lie Tyr Tyr 
210 " 215 220 

TGT ACG GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCA 72 0 

Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gin Gly Thr Ser 
22S • ' 230 235 240 

GTC ACC GTC TCC TAA TAA GGA TCC 744 
Val Thr Val Ser * / Gly Ser 

2'-. 5 

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 248 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1 5 10 15

Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser
20 25 30

Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
35 40 45

Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val
50 55 60

Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
65 70 75 80

Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln
85 90 95

Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
100 105 110

Lys Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Glu
115 120 125

Val Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro
130 135 140

Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp
145 150 155 160

Met Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala
165 170 175

Gln Ile Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser
180 185 190

Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val
195 200 205

Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr
210 215 220

Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser
225 230 235 240

Val Thr Val Ser * * Gly Ser
245

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 761 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..756

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

^ ;s sn'si? ?q ;s s s e ki ;ii s s =S 

si §S ^2 i iK S ^ K3 S= ill S St SS IS 

m s ?s n: ?^ s s= s is 

s si5 S IJ? m If. I" S S5 - 

S SS S; E S K S ?S S 

65 "^^ 

r'-\ r""^ "AT "^C TGC TCT CAA AGT 
^11 III vll ^Iu ^Iu 5ai ^ ^he Cy3 Se. Cln Ser 

^ -T-^ rr-r r^-^ G'--^ J^f^C AAG CTr GP-A ATC AAA 

5S ;s fs s 

100 1*^^ 

T-r-r TCr- G-V\ GGT A.\A GGT GAA GTT 384 
Z ^ t| IZ f^y try ^ llr ll. Clu Lys c:. OXu Val 



y-r-r^ TT r^rr r*-Ki rC^ GGG AGG CCC ATG 

^ til lil 0.1 Tr ITy 1£ 5:; .o p~ 



»tt «C CCT T»T ..T TJT Jj. AC. Tjt « T=J =jT TCT 

He Arg Asn Lys Fr.. Xyr Asn lyr Glu ih. T/. .yr 
180 



48 



9S 



144 



192 



240 



288 



336 



432 



Lys Leu Asp Glu Thr Gly Gly Gly Leu vcl ^in ^ru 

130 *25 
AAA CTC TCC TGT G.T GCC TCT GGA TIC ACT TTT AGT GAC TAC TGG, ATG 480 

^u ser Cys V.. Ala Ser Gly Pr.e Th. Phe Ser Asp Tyr Trp MeC 
145 '--^ 

«r.^ n^.r p"^ ' C^G GAG TGG G TA GCA CAA 528 

!2 s; s ss gj s s a; sr s,. «. «. 



576 



C'- G^T -^"C .^J^-\ AGT AGT GTC TAG 624 
AAA GGC AGA TTC ACC A:, T> A AG. G. C . 
Lys Gly Arg Phe Thr ^^e Ser ^rg A.p A^p . .r . 

CTG CAA ATG AAC ..C TTA AGA GTT G.. C.C ;.TG GGT ATG T.T TAC TGT 
Leu Gin Met Asa A.a Leu Arg Val Glu ..p ... G^/ X^. y y 

210 215 

-Tr C-C TkC TGG GGT CAA GGA ACC TCG GTC 
ACG GGT TCT TAC '--T ^TG G«C x-k. ]_^^ 

Thr Gly Ser Tyr Ty r Gly Mec Asp Tyr x rp Gly G.a / 
23 0 



720 



225

ACC GTC TCC AGT GAT AAG ACC CAT ACA TGC TAA TAGGATCC 761
Thr Val Ser Ser Asp Lys Thr His Thr Cys *

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 251 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
1 5 10 15

Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser
20 25 30

Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser
35 40 45

Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro
50 55 60

Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
65 70 75 80

Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser
85 90 95

Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
100 105 110

Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Glu Val
115 120 125

Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro Met
130 135 140

Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met
145 150 155 160

Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln
165 170 175

Ile Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val
180 185 190

Lys Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val Tyr
195 200 205

Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr Cys
210 215 220

Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val
225 230 235 240

Thr Val Ser Ser Asp Lys Thr His Thr Cys *
245 250

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 770 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..765

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:



S SS S ?£ S K IS - s SII S S 



i 5 1» 



s IS K s ;s IS §?s ?s s SI? s?f ^ 

20 -"^ 

»»n. r-n^ ACC -^'T -^A CGT TGG TAC CTG CAG AAG CCA GGC GAG TCT 

Glv Thr Xrp X-.^r Leu Gin Lys Pro Gly Gin Ser 



ksn Gly Asn Thr Leu Arg Trp IVr i.eu ..n ..^ 

s s £s fji ?s s; s k si s? s s 

sSl s 51? a; ^ i 5n k =3 

S 5tl ?S K IS ?S S =5 i S5 ^ K i S 



100 



S5 s iji is §^ t?; IS IS - Si tfj s?5 s; ?s 

115 

. ^^1. r^r-r- rr" rTr: rx.\ CCT GCG CCC ATG 

s! Ki- S", s "5 i.s ns 



130 



12 D 



;^ CTC TCC TGT OV. GCC TCT CGA TTC ACT TTT ACT GAC TAC TOO ATG 

Lys Leu Ssr Cys V.-- Ala Ser uly Pr- i _-| ■ 

145 

^_ ^.f- ^.1 rrA. CTG GAG TGG GTA GCA CAA 

AAC TGG. GTC. CGC CAG VrV CCA G^G G .A CTG G.G 

Asn Trp Val Arg O 'n S«r Pro Glu L> » C-1/ .. -u 

KTT AGA AAC AAA CCT TAT AA.T TAT C.A^V ACA TAT-TAT TCA CAT TCT GTG 

He Arg Asn Lvd Tz'^ T^/r Asn T/r Glu i/^ ^/^ ^^^^^ . 

130 

AAA GGC AGA TTC AC.C ATC TCA AGA CAT GAT TCC A.VA AGX AGT GTC TAC 
^ Gly Arg Phe Thr He 3=r Arg Aap Asp Ser l,s ..r "yr 



195 200 



CTG CAA ATG AAC AAC TTA AGA CTT CAA GA. ATG COT ATC TAT ^ ^ 
Leu Gin Met Asn A. '. Lsu Arg ^al Clu ^-J ..c. ciy 



210 



ACC GOT TCT TAC TAT CGT ATG CA= TAC TOG C.T CAA GGA ACC TCG GTC 

Thr Gly Ser Tyr Y., Gly MeC ^.sp T.tt : r.j ^ -n Y 

23 0 " . 

AAG ACC CAT ACA TGC CCT CCA TGC TAA TAGGATCC 



225 23 0 



ACC GTC TCC AGT G..: ^-v. -7* , 

Thr val ser Ser 7.ep Lys Thr .is .hr c>3 P.o tr . 



48 



96 



144 



192 



240 



288 



336 



3B4 



432 



480 



528 



576 



624 



672 



720 



770 



2n5 



50 



255

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 254 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
1 5 10 15

Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser
20 25 30

Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser
35 40 45

Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro
50 55 60

Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
65 70 75 80

Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser
85 90 95

Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
100 105 110

Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Glu Val
115 120 125

Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro Met
130 135 140

Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met
145 150 155 160

Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln
165 170 175

Ile Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val
180 185 190

Lys Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val Tyr
195 200 205

Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr Cys
210 215 220

Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val
225 230 235 240

Thr Val Ser Ser Asp Lys Thr His Thr Cys Pro Pro Cys *
245 250



(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1460 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1398

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC 48
Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1 5 10 15

GAG AAG GTT ACT TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT 96
Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser
20 25 30

GGT AAT CAA AAG AAC TAC TTG GCC TGG TAC CAG CAG AAA CCA GGG CAG 144
Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
35 40 45

TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG GAA TCT GGG GTC 192
Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val
50 55 60

CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC 240
Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
65 70 75 80

ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG 288
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln
85 90 95

TAT TAT AGC TAT CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG 336
Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
100 105 110

AAA GGC TCT ACT TCC GGT AGC GGC AAA TCC TCT GAA GGC AAA GGT CAG 384
Lys Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Gln
115 120 125

GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT GGG GCT TCA 432
Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser
130 135 140

GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC CAT GCA 480
Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala
145 150 155 160

ATT CAC TGG GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA 528
Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly
165 170 175

TAT TTT TCT CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG 576
Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys
180 185 190

GGC AAG GCC ACA CTG ACT GCA GAC AAA TCC TCC AGC ACT GCC TAC GTG 624
Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val
195 200 205

CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT GCA GTG TAT TTC TGT ACA 672
Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr
210 215 220

AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC ACC GTC 720
Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val
225 230 235 240

TCC TCA GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA 768
Ser Ser Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser
245 250 255

GTT GGC GAG AAG GTT ACT TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA 816
Val Gly Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu
260 265 270

TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC TGG TAC CAG CAG AAA CCA 864
Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro
275 280 285

GGG CAG TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG GAA TCT 912
Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser
290 295 300

GGG GTC CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT 960
Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr
305 310 315 320

CTC TCC ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT 1008
Leu Ser Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys
325 330 335

CAG CAG TAT TAT AGC TAT CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT 1056
Gln Gln Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu
340 345 350

GTG CTG AAA GGC TCT ACT TCC GGT AGC GGC AAA TCC TCT GAA GGC AAA 1104
Val Leu Lys Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys
355 360 365

GGT CAG GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT GGG 1152
Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly
370 375 380

GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC 1200
Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp
385 390 395 400

CAT GCA ATT CAC TGG GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG 1248
His Ala Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp
405 410 415

ATT GGA TAT TTT TCT CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG 1296
Ile Gly Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg
420 425 430

TTC AAG GGC AAG GCC ACA CTG ACT GCA GAC AAA TCC TCC AGC ACT GCC 1344
Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala
435 440 445

TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT GCA GTG TAT TTC 1392
Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe
450 455 460

TGT ACA AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC 1440
Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val
465 470 475 480

ACC GTC TCC TAA TAG GAT CC 1460
Thr Val Ser * * Asp
485

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 486 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1 5 10 15

Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser
20 25 30

Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
35 40 45

Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val
50 55 60

Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
65 70 75 80

WO 93/11161

PCT/US92/09965

Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln
85 90 95

Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
100 105 110

Lys Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly Gln
115 120 125

Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser
130 135 140

Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala
145 150 155 160

Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly
165 170 175

Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys
180 185 190

Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val
195 200 205

Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr
210 215 220

Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val
225 230 235 240

Ser Ser Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser
245 250 255

Val Gly Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu
260 265 270

Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro
275 280 285

Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser
290 295 300

Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr
305 310 315 320

Leu Ser Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys
325 330 335

Gln Gln Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu
340 345 350

Val Leu Lys Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys
355 360 365

Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly
370 375 380

Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp
385 390 395 400

His Ala Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp
405 410 415

Ile Gly Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg
420 425 430

Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala
435 440 445

Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe
450 455 460

Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val
465 470 475 480

Thr Val Ser * * Asp

485

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 725 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..723

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT 48
Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
1 5 10 15

GAT CAA GCC TCC ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT 96
Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser
20 25 30

AAT GGA AAC ACC TAT TTA CGT TGG TAC CTG CAG AAG CCA GGC CAG TCT 144
Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser
35 40 45

CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT TCT GGC GTC CCA 192
Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro
50 55 60

GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTC AAG ATC 240
Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
65 70 75 80

AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT 288
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser
85 90 95

ACA CAT GTT CCG TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA 336
Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
100 105 110

GGT TCT ACC TCT GGT AAA CCA TCT GAA GGC AAA GGT CAG GTT CAG CTG 384
Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly Gln Val Gln Leu
115 120 125

CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT GGG GCT TCA GTG AAG ATT 432
Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser Val Lys Ile
130 135 140

TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC CAT GCA ATT CAC TGG 480
Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala Ile His Trp
145 150 155 160

GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT 528
Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser
165 170 175

CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG GGC AAG GCC 576
Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala
180 185 190

ACA CTG ACT GCA GAC AAA TCC TCC AGC ACT GCC TAC GTG CAG CTC AAC 624
Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn
195 200 205

AGC CTG ACA TCT GAG GAT TCT GCA GTG TAT TTC TGT ACA AGA TCC CTG 672
Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr Arg Ser Leu
210 215 220

AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC ACC GTC TCC TAA TAG 720
Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser * *
225 230 235 240

GAT CC 725
Asp
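
As an illustrative check only, and not part of the listing as filed, the pairing of each nucleotide line with the amino acid line printed beneath it can be verified by translating the codons. The Python sketch below assumes the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical helpers introduced here, and the input is the first line of SEQ ID NO:20.

```python
from itertools import product

# Standard genetic code with codons enumerated in T, C, A, G order.
# Assumption: the listing uses the standard code; all names are illustrative.
BASES = "TCAG"
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",  # stop codon, printed as an asterisk as in the listing
}
# product() varies the third base fastest, matching the AMINO_1 ordering.
CODON_TABLE = {
    "".join(codon): THREE[aa]
    for codon, aa in zip(product(BASES, repeat=3), AMINO_1)
}


def translate(nucleotides: str) -> str:
    """Translate a nucleotide string (spaces ignored) to three-letter residues."""
    seq = "".join(nucleotides.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return " ".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


first_line = "GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT"
print(translate(first_line))
# prints: Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
```

The same arithmetic relates the header fields: a CDS at positions 1..723 holds 723 / 3 = 241 codons, which matches the 241 positions (including the two stop codons) given for SEQ ID NO:21.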



(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 241 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly
1 5 10 15

Asp Gln Ala Ser Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser
20 25 30

Asn Gly Asn Thr Tyr Leu Arg Trp Tyr Leu Gln Lys Pro Gly Gln Ser
35 40 45

Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe Ser Gly Val Pro
50 55 60

Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
65 70 75 80

Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser
85 90 95

Thr His Val Pro Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
100 105 110

Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly Gln Val Gln Leu
115 120 125

Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser Val Lys Ile
130 135 140

Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala Ile His Trp
145 150 155 160

Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser
165 170 175

Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala
180 185 190

Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn
195 200 205

Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr Arg Ser Leu
210 215 220

Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser * *
225 230 235 240

Asp

(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 738 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..738

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC 48
Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1 5 10 15

GAG AAG GTT ACT TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT 96
Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser
20 25 30

GGT AAT CAA AAG AAC TAC TTG GCC TGG TAC CAG CAG AAA CCA GGG CAG 144
Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
35 40 45

TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG GAA TCT GGG GTC 192
Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val
50 55 60

CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC 240
Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
65 70 75 80

ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG 288
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln
85 90 95

TAT TAT AGC TAT CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG 336
Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
100 105 110

AAA GGC TCT ACT TCC GGT AAA CCA TCT GAA GGT AAA GGT GAA GTT AAA 384
Lys Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly Glu Val Lys
115 120 125

CTG GAT GAG ACT GGA GGA GGC CTG GTG CAA CCT GGG AGG CCC ATG AAA 432
Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro Met Lys
130 135 140

CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG ATG AAC 480
Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met Asn
145 150 155 160

TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA ATT 528
Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile
165 170 175

AGA AAC AAA CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG AAA 576
Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys
180 185 190

GGC AGA TTC ACC ATC TCA AGA GAT GAT TCC AAA AGT AGT GTC TAC CTG 624
Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val Tyr Leu
195 200 205

CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT ATC TAT TAC TGT ACG 672
Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr Cys Thr
210 215 220

GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCA GTC ACC 720
Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val Thr
225 230 235 240

GTC TCC TAA TAA GGA TCC 738
Val Ser * * Gly Ser
245



(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 246 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1 5 10 15

Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser
20 25 30

Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
35 40 45

Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val
50 55 60

Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
65 70 75 80

Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln
85 90 95

Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
100 105 110

Lys Gly Ser Thr Ser Gly Lys Pro Ser Glu Gly Lys Gly Glu Val Lys
115 120 125

Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly Arg Pro Met Lys
130 135 140

Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met Asn
145 150 155 160

Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile
165 170 175

Arg Asn Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys
180 185 190

Gly Arg Phe Thr Ile Ser Arg Asp Asp Ser Lys Ser Ser Val Tyr Leu
195 200 205

Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile Tyr Tyr Cys Thr
210 215 220

Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val Thr
225 230 235 240

Val Ser * * Gly Ser

245

What Is Claimed Is:

1. A multivalent antigen-binding protein comprising two or more
single-chain molecules, each single-chain molecule comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain molecule.

2. The multivalent protein of claim 1 wherein said first polypeptide
comprises the binding portion of the variable region of an antibody light chain,
and said second polypeptide comprises the binding portion of the variable
region of an antibody heavy chain.

3. The multivalent protein of claim 1 wherein said first polypeptide
comprises the binding portion of the variable region of an antibody light chain,
and said second polypeptide comprises the binding portion of the variable
region of an antibody light chain.

4. The multivalent protein of claim 1 wherein said first polypeptide
comprises the binding portion of the variable region of an antibody heavy
chain, and said second polypeptide comprises the binding portion of the
variable region of an antibody heavy chain.

5. The multivalent protein of claims 1, 2, 3, or 4 comprising a
bivalent antigen-binding protein.

6. The multivalent protein of claim 5 comprising a heterobivalent
antigen-binding protein.

7. The multivalent protein of claim 5 comprising a homobivalent
antigen-binding protein.

8. A composition comprising a multivalent antigen-binding protein 
substantially free of Mngle-chain molecules, wherein said multivalent protein 
comprises two or i^ore single-chain :nolecules, each single-chain molecule 
comprising: 

(?.) a first polypeptide comprising the binding portion of die 
variable region of an antibody heavy or light chain; 

(b) a second polypeptide comprising the binding portion of 

the variable region of an antibody heavy or light chain; and 

(c) a peptide linker linkirg said first and second polypeptides 

(a) and (b) into said single-chain molecule. 

9. The composition of claim 8 wherein said first polypeptide
comprises the binding portion of the variable region of an antibody light chain,
and said second polypeptide comprises the binding portion of the variable
region of an antibody heavy chain.

10. The composition of claim 8 wherein said first polypeptide
comprises the binding portion of the variable region of an antibody light chain,
and said second polypeptide comprises the binding portion of the variable
region of an antibody light chain.

11. The composition of claim 8 wherein said first polypeptide 
comprises the binding portion of the variable region of an antibody heavy 
chain, and said second polypeptide comprises the binding portion of the 
variable region of an antibody heavy chain. 

12. The composition of claims 8, 9, 10, or 11, comprising a
bivalent antigen-binding protein substantially free of single-chain molecules.

13. The composition of claim 12 wherein said bivalent protein is
heterobivalent.

14. The composition of claim 12 wherein said bivalent protein is
homobivalent.

15. An aqueous composition comprising an excess of multivalent
antigen-binding protein over single-chain molecules, said multivalent protein
comprising two or more single-chain molecules, each single-chain molecule
comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

16. The aqueous composition of claim 15 wherein at least one of
said single-chain molecules comprises:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

17. The aqueous composition of claim 15 wherein at least one of
said single-chain molecules comprises:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody light chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

18. The composition of claim 15 wherein at least one of said
single-chain molecules comprises:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

19. A method of producing a multivalent antigen-binding protein,
comprising the steps of:

(a) producing a composition comprising multivalent antigen-binding
protein and single-chain molecules, each single-chain molecule
comprising:

(i) a first polypeptide comprising the binding portion
of the variable region of an antibody heavy or light chain;

(ii) a second polypeptide comprising the binding
portion of the variable region of an antibody heavy or light chain; and

(iii) a peptide linker linking said first and second
polypeptides (a) and (b) into said single-chain molecule;

(b) separating said multivalent protein from said single-chain
molecules; and

(c) recovering said multivalent protein.

20. The method of claim 19 wherein separating said multivalent
protein from said single-chain molecules comprises utilizing cation exchange
chromatography.

21. The method of claim 19 wherein separating said multivalent
protein from said single-chain molecules comprises utilizing gel filtration
chromatography.

22. A method of producing a multivalent antigen-binding protein
comprising the steps of:

(a) producing a composition comprising single-chain
molecules, each single-chain molecule comprising:

(i) a first polypeptide comprising the binding portion
of the variable region of an antibody heavy or light chain;

(ii) a second polypeptide comprising the binding
portion of the variable region of an antibody heavy or light chain; and

(iii) a peptide linker linking said first and second
polypeptides (a) and (b) into said single-chain molecule;

(b) dissociating said single-chain molecules;

(c) re-associating said single-chain molecules;

(d) separating multivalent antigen-binding proteins from said
single-chain molecules; and

(e) recovering said multivalent proteins.

23. The method of claim 22 wherein said dissociation is caused by
dialysis against a dissociating solution.

24. The method of claim 22 wherein said reassociation is caused by
dialysis against a refolding solution or a refolding agent.

25. A method of producing a multivalent antigen-binding protein,
comprising the step of cross-linking at least two single-chain molecules to each
other, each single-chain molecule comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain molecule.

26. The method of claim 25 wherein said cross-linking is effected
by chemical means.

27. A method of producing a multivalent antigen-binding protein,
comprising the steps of:

(a) producing a composition comprising single-chain
molecules, each single-chain molecule comprising:

(i) a first polypeptide comprising the binding portion
of the variable region of an antibody heavy or light chain;

(ii) a second polypeptide comprising the binding
portion of the variable region of an antibody heavy or light chain; and

(iii) a peptide linker linking said first and second
polypeptides (a) and (b) into said single-chain molecule;

(b) concentrating said single-chain molecules;

(c) separating said multivalent protein from said single-chain
molecules; and

(d) recovering said multivalent protein.

28. The method of claim 27 wherein said concentrating step occurs
from approximately 0.5 mg/ml single-chain molecule to the concentration at
which precipitation starts.

29. A method of detecting an antigen in or suspected of being in a
sample, which comprises:

(a) contacting said sample with the multivalent antigen-
binding protein of claim 1, and

(b) detecting whether said multivalent antigen-binding
protein has bound to said antigen.

30. A method of imaging the internal structure of an animal,
comprising administering to said animal an effective amount of a labeled form
of the multivalent antigen-binding protein of claim 1 and measuring detectable
radiation associated with said animal.

31. A composition comprising an association of a multivalent
antigen-binding protein as claimed in any one of claims 1-4, 8-11, or 15-18
with a therapeutically or diagnostically effective agent.

32. A single-chain protein comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody light chain;

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

33. A single-chain protein comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy chain;

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

34. A single-chain protein comprising:

(a) a first polypeptide comprising the VL or VH of a CC49
monoclonal antibody;

(b) a second polypeptide comprising the VL or VH of a CC49
monoclonal antibody; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

35. The single-chain protein of claim 34 wherein said linker is
selected from the group consisting of the 202', 212, 216, and 217 linkers.

36. A single-chain protein comprising:

(a) a first polypeptide comprising the VL or VH of a CC49
monoclonal antibody;

(b) a second polypeptide comprising the VL or VH of a 4-4-20
monoclonal antibody; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain protein.

37. The single-chain protein of claim 36 wherein said linker is
selected from the group consisting of the 202', 212, 216, and 217 linkers.

38. A genetic sequence which codes for the single-chain protein of
claim 32, comprising:

(a) a DNA sequence coding for a first polypeptide
comprising the binding portion of the variable region of an antibody light
chain;

(b) a DNA sequence coding for a second polypeptide
comprising the binding portion of the variable region of an antibody light
chain;

(c) a DNA sequence coding for a peptide linker linking said
first and second polypeptides (a) and (b) into said single-chain protein.

39. A genetic sequence which codes for the single-chain protein of
claim 33, comprising:

(a) a DNA sequence coding for a first polypeptide
comprising the binding portion of the variable region of an antibody heavy
chain;

(b) a DNA sequence coding for a second polypeptide
comprising the binding portion of the variable region of an antibody heavy
chain;

(c) a DNA sequence coding for a peptide linker linking said
first and second polypeptides (a) and (b) into said single-chain protein.

40. A genetic sequence which codes for the single-chain protein of
claim 34, comprising:

(a) a DNA sequence coding for the VL or VH of a CC49
monoclonal antibody;

(b) a DNA sequence coding for the VL or VH of a CC49
monoclonal antibody;

(c) a DNA sequence coding for a peptide linker linking said
first and second polypeptides (a) and (b) into said single-chain protein.

41. The genetic sequence of claim 40 wherein said DNA sequence
codes for a peptide linker selected from the group consisting of the 202', 212,
216, and 217 linkers.

42. A genetic sequence which codes for the single-chain protein of
claim 36, comprising:

(a) a DNA sequence coding for the VL or VH of a CC49
monoclonal antibody;

(b) a DNA sequence coding for the VL or VH of a 4-4-20
monoclonal antibody;

(c) a DNA sequence coding for a peptide linker linking said
first and second polypeptides (a) and (b) into said single-chain protein.

43. The genetic sequence of claim 42 wherein said DNA sequence
codes for a peptide linker selected from the group consisting of the 202', 212,
216, and 217 linkers.

44. A multivalent single-chain antigen-binding protein comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain;

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said multivalent protein;

(d) a third polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(e) a fourth polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain;

(f) a peptide linker linking said third and fourth polypeptides
(d) and (e) into said multivalent protein; and

(g) a peptide linker linking said second and third
polypeptides (b) and (d) into said multivalent protein.

45. A multivalent single-chain antigen-binding protein comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy chain;

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said multivalent protein;

(d) a third polypeptide comprising the binding portion of the
variable region of an antibody light chain;

(e) a fourth polypeptide comprising the binding portion of
the variable region of an antibody heavy chain;

(f) a peptide linker linking said third and fourth polypeptides
(d) and (e) into said multivalent protein; and

(g) a peptide linker linking said second and third
polypeptides (b) and (d) into said multivalent protein.

46. A genetic sequence which codes for the multivalent antigen-
binding protein of claim 44 or 45, comprising:

(a) a DNA sequence coding for a first polypeptide
comprising the binding portion of the variable region of an antibody heavy or
light chain;

(b) a DNA sequence coding for a second polypeptide
comprising the binding portion of the variable region of an antibody heavy or
light chain;

(c) a DNA sequence coding for a peptide linker linking said
first and second polypeptides (a) and (b) into said multivalent protein;

(d) a DNA sequence coding for a third polypeptide
comprising the binding portion of the variable region of an antibody heavy or
light chain;

(e) a DNA sequence coding for a fourth polypeptide
comprising the binding portion of the variable region of an antibody heavy or
light chain;

(f) a DNA sequence coding for a peptide linker linking said
third and fourth polypeptides (d) and (e) into said multivalent protein; and

(g) a DNA sequence coding for a peptide linker linking said
second and third polypeptides (b) and (d) into said multivalent protein.

47. A replicable cloning or expression vehicle comprising the DNA
sequence of any one of claims 38-43.

48. The vehicle of claim 47 which is a plasmid.

49. A host cell transformed with the vehicle of claim 47.

50. The host cell of claim 49 which is a bacterial cell, a yeast cell
or other fungal cell, or a mammalian cell line.

51. A method of producing a multivalent antigen-binding protein
comprising two or more single-chain molecules, each single-chain molecule
comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain; and

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said single-chain molecule, said method comprising:

(i) providing a genetic sequence coding for said
single-chain molecule;

(ii) transforming one or more host cells with said
sequence;

(iii) expressing said sequence in said host or hosts;
and

(iv) recovering a multivalent protein from said host
or hosts.


52. A method of producing a multivalent single-chain antigen-
binding protein comprising two or more single-chain molecules, each single-
chain molecule comprising:

(a) a first polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(b) a second polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain;

(c) a peptide linker linking said first and second polypeptides
(a) and (b) into said multivalent protein;

(d) a third polypeptide comprising the binding portion of the
variable region of an antibody heavy or light chain;

(e) a fourth polypeptide comprising the binding portion of
the variable region of an antibody heavy or light chain;

(f) a peptide linker linking said third and fourth polypeptides
(d) and (e) into said multivalent protein; and

(g) a peptide linker linking said second and third
polypeptides (b) and (d) into said multivalent protein, said method comprising:

(i) providing a genetic sequence coding for said
single-chain molecule;

(ii) transforming one or more host cells with said
sequence;

(iii) expressing said sequence in said host or hosts;
and

(iv) recovering a multivalent protein from said host
or hosts.

53. The method of claim 51 or 52 wherein recovering said
multivalent protein comprises separating said multivalent protein from said
single-chain molecules.

54. The method of claim 51 or 52 wherein recovering said
multivalent protein comprises:

(a) dissociating said single-chain molecules;

(b) re-associating said single-chain molecules;

(c) separating multivalent antigen-binding proteins from said
single-chain molecules; and

(d) recovering said multivalent proteins.

55. The method of claim 51 or 52 which further comprises
purifying said recovered multivalent protein.



56. The method of claim 51 or 52 wherein said host cell is a
bacterial cell, a yeast cell or other fungal cell, or a mammalian cell line.

57. The method of claim 56 wherein said host cell is E. coli or
Bacillus subtilis.

58. The multivalent antigen-binding protein of claim 1 in
detectably-labelled form.

59. In an immunoassay method which utilizes an antibody in
detectably-labelled form, the improvement comprising using the multivalent
protein of claim 58 instead of said antibody.

60. The immunoassay of claim 59 wherein said immunoassay is a 
competitive immunoassay. 

61. The immunoassay of claim 59 wherein said immunoassay is a
sandwich immunoassay.

62. In an immunotherapeutic method which utilizes an antibody
conjugated to a therapeutic agent, the improvement comprising using the
multivalent protein of claim 1 instead of said antibody.

63. In a method of immunoaffinity purification which utilizes an
antibody therefor, the improvement which comprises using the molecule of
claim 1 instead of said antibody.
4-4-20 VL/212/CC49 VH gene

4-4-20 VL 10 20
Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly Asp Gln Ala Ser
GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT GAT CAA GCC TCC
Aat II

30 40
Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser Asn Gly Asn Thr Tyr Leu Arg Trp
ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT AAT GGA AAC ACC TAT TTA CGT TGG

50 60
Tyr Leu Gln Lys Pro Gly Gln Ser Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe
TAC CTG CAG AAG CCA GGC CAG TCT CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT

70 80
Ser Gly Val Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
TCT GGG GTC CCA GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTC AAG ATC

90 100
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser Thr His Val Pro
AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT ACA CAT GTT CCG

110 212 Linker 120
Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Gly Ser Thr Ser Gly Ser Gly Lys
TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA GGT TCT ACC TCT GGT TCT GGT AAA
Hind III

CC49 VH 130 140
Ser Ser Glu Gly Lys Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro
TCC TCT GAA GGC AAA GGT CAG GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT
Pvu II Pst I

150 160
Gly Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala Ile
GGG GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC CAT GCA ATT

170 180
His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser Pro Gly
CAC TGG GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT CCC GGA

FIG. 10A
4-4-20 VL/212/CC49 VH gene

190 200
Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys
AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG GGC AAG GCC ACA CTG ACT GCA GAC AAA

210 220
Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr
TCC TCC AGC ACT GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT GCA GTG TAT

230 240
Phe Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser
TTC TGT ACA AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC ACC GTC TCC

*** *** Asp
TAA TAG GAT CC
Bam HI

FIG. 10A (CONT.)
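
The pairing of codon rows and residue rows in FIG. 10A can be spot-checked mechanically. The following is a minimal sketch, not part of the original disclosure: it translates the codons printed under the "212 Linker" label with the standard genetic code. The names `CODON_TABLE`, `LINKER_212_DNA`, and `translate` are hypothetical, and treating exactly these 14 codons as the linker is an assumption based on where the label sits between the VL and VH segments.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Codons read from FIG. 10A under the "212 Linker" label (assumed span:
# the 14 codons between the end of 4-4-20 VL and the start of CC49 VH).
LINKER_212_DNA = "GGT TCT ACC TCT GGT TCT GGT AAA TCC TCT GAA GGC AAA GGT"


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter residues."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


if __name__ == "__main__":
    print(translate(LINKER_212_DNA))  # GSTSGSGKSSEGKG
```

The one-letter result corresponds to the residue row Gly Ser Thr Ser Gly Ser Gly Lys Ser Ser Glu Gly Lys Gly shown in the figure. The same helper can be applied to any other codon row to flag positions where the printed residue and codon disagree.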
CC49 VL/212/4-4-20 VH gene

CC49 VL 10 20
Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly Glu Lys Val Thr
GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC GAG AAG GTT ACT
Aat II

30 40
Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala
TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC

50 60
Trp Tyr Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg
TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG

70 80
Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
GAA TCT GGG GTC CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC

90 100
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr Ser Tyr
ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG TAT TAT AGC TAT

110 212 Linker 120
Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu Lys Gly Ser Thr Ser Gly Ser Gly
CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG AAA GGC TCT ACT TCC GGT AGC GGC
Hind III

4-4-20 VH 130 140
Lys Ser Ser Glu Gly Lys Gly Glu Val Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln
AAA TCT TCT GAA GGT AAA GGT GAA GTT AAA CTG GAT GAG ACT GGA GGA GGC TTG GTG CAA

150 160
Pro Gly Arg Pro Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp
CCT GGG AGG CCC ATG AAA CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG

170 180
Met Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile Arg Asn
ATG AAC TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA ATT AGA AAC

FIG. 10B
CC49 VL/212/4-4-20 VH gene

190 200
Lys Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys Gly Arg Phe Thr Ile Ser
AAA CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG AAA GGC AGA TTC ACC ATC TCA

210 220
Arg Asp Asp Ser Lys Ser Ser Val Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met
AGA GAT GAT TCC AAA AGT AGT GTC TAC CTG CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG

230 240
Gly Ile Tyr Tyr Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser
GGT ATC TAT TAC TGT ACG GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCA

Val Thr Val Ser *** *** Gly Ser
GTC ACC GTC TCC TAA TAA GGA TCC
Bam HI

FIG. 10B (CONT.)
4-4-20/212 protein with single cysteine hinge

4-4-20 VL 10 20
Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly Asp Gln Ala Ser
GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT GAT CAA GCC TCC
Aat II

30 40
Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser Asn Gly Asn Thr Tyr Leu Arg Trp
ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT AAT GGA AAC ACC TAT TTA CGT TGG

50 60
Tyr Leu Gln Lys Pro Gly Gln Ser Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe
TAC CTG CAG AAG CCA GGC CAG TCT CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT

70 80
Ser Gly Val Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
TCT GGG GTC CCA GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTC AAG ATC

90 100
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser Thr His Val Pro
AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT ACA CAT GTT CCG

110 212 Linker 120
Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Gly Ser Thr Ser Gly Ser Gly Lys
TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA GGT TCT ACC TCT GGT TCT GGT AAA
Hind III

4-4-20 VH 130 140
Ser Ser Glu Gly Lys Gly Glu Val Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro
TCT TCT GAA GGT AAA GGT GAA GTT AAA CTG GAT GAG ACT GGA GGA GGC TTG GTG CAA CCT

150 160
Gly Arg Pro Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met
GGG AGG CCC ATG AAA CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG ATG

170 180
Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile Arg Asn Lys
AAC TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA ATT AGA AAC AAA

FIG. 15A
4-4-20/212 protein with single cysteine hinge

190 200
Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg
CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG AAA GGC AGA TTC ACC ATC TCA AGA

210 220
Asp Asp Ser Lys Ser Ser Val Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly
GAT GAT TCC AAA AGT AGT GTC TAC CTG CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT

230 240
Ile Tyr Tyr Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val
ATC TAT TAC TGT ACG GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCG GTC
Bst EII

Hinge 250
Thr Val Ser Ser Asp Lys Thr His Thr Cys
ACC GTC TCC AGT GAT AAG ACC CAT ACA TGC TAA TAG GAT CC
Bam HI

pGx 5532, Gx 8932
4-4-20/212 protein with two cysteine hinge

4-4-20 VL 10 20
Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly Asp Gln Ala Ser
GAC GTC GTT ATG ACT CAG ACA CCA CTA TCA CTT CCT GTT AGT CTA GGT GAT CAA GCC TCC

30 40
Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser Asn Gly Asn Thr Tyr Leu Arg Trp
ATC TCT TGC AGA TCT AGT CAG AGC CTT GTA CAC AGT AAT GGA AAC ACC TAT TTA CGT TGG

50 60
Tyr Leu Gln Lys Pro Gly Gln Ser Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe
TAC CTG CAG AAG CCA GGC CAG TCT CCA AAG GTC CTG ATC TAC AAA GTT TCC AAC CGA TTT

70 80
Ser Gly Val Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile
TCT GGG GTC CCA GAC AGG TTC AGT GGC AGT GGA TCA GGG ACA GAT TTC ACA CTC AAG ATC

90 100
Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser Thr His Val Pro
AGC AGA GTG GAG GCT GAG GAT CTG GGA GTT TAT TTC TGC TCT CAA AGT ACA CAT GTT CCG

110 212 Linker 120
Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Gly Ser Thr Ser Gly Ser Gly Lys
TGG ACG TTC GGT GGA GGC ACC AAG CTT GAA ATC AAA GGT TCT ACC TCT GGT TCT GGT AAA
Hind III

4-4-20 VH 130 140
Ser Ser Glu Gly Lys Gly Glu Val Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro
TCT TCT GAA GGT AAA GGT GAA GTT AAA CTG GAT GAG ACT GGA GGA GGC TTG GTG CAA CCT

150 160
Gly Arg Pro Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met
GGG AGG CCC ATG AAA CTC TCC TGT GTT GCC TCT GGA TTC ACT TTT AGT GAC TAC TGG ATG

170 180
Asn Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile Arg Asn Lys
AAC TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA ATT AGA AAC AAA

FIG. 15B
4-4-20/212 protein with two cysteine hinge

190 200
Pro Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg
CCT TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG AAA GGC AGA TTC ACC ATC TCA AGA

210 220
Asp Asp Ser Lys Ser Ser Val Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly
GAT GAT TCC AAA AGT AGT GTC TAC CTG CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT

230 240
Ile Tyr Tyr Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val
ATC TAT TAC TGT ACG GGT TCT TAC TAT GGT ATG GAC TAC TGG GGT CAA GGA ACC TCG GTC
Bst EII

Hinge 250
Thr Val Ser Ser Asp Lys Thr His Thr Cys Pro Pro Cys ***
ACC GTC TCC AGT GAT AAG ACC CAT ACA TGC CCT CCA TGC TAA TAG GAT CC
Bam HI

pGx 5533, Gx 8933
CC49/212 SCA™ protein genetic dimer

CC49 VL 10 20
Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly Glu Lys Val Thr
GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC GAG AAG GTT ACT
Aat II

30 40
Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala
TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC

50 60
Trp Tyr Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg
TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG

70 80
Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
GAA TCT GGG GTC CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC

90 100
Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr Ser Tyr
ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG TAT TAT AGC TAT

110 212 Linker 120
Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu Lys Gly Ser Thr Ser Gly Ser Gly
CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG AAA GGC TCT ACT TCC GGT AGC GGC
Hind III

CC49 VH 130 140
Lys Ser Ser Glu Gly Lys Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys
AAA TCC TCT GAA GGC AAA GGT CAG GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA
Pvu II Pst I

150 160
Pro Gly Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala
CCT GGG GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC CAT GCA

170 180
Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser Pro
ATT CAC TGG GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT CCC

FIG. 16A
CC49/212 SCA™ protein genetic dimer

190 200
Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp
GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG GGC AAG GCC ACA CTG ACT GCA GAC

210 220
Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val
AAA TCC TCC AGC ACT GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT GCA GTG

230 240
Tyr Phe Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val
TAT TTC TGT ACA AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC ACC GTC

CC49 VL 250 260
Ser Ser Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly Glu Lys
TCC TCA GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC GAG AAG
Aat II

270 280
Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr
GTT ACT TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC

290 300
Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser
TTG GCC TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC

310 320
Ala Arg Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr
GCT AGG GAA TCT GGG GTC CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT

330 340
Leu Ser Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr
CTC TCC ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG TAT TAT

350 212 Linker 360
Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu Lys Gly Ser Thr Ser Gly
AGC TAT CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG AAA GGC TCT ACT TCC GGT
Hind III

CC49 VH 370 380
Ser Gly Lys Ser Ser Glu Gly Lys Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu
AGC GGC AAA TCC TCT GAA GGC AAA GGT CAG GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG
Pvu II Pst I

FIG. 16B
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CC49/21H SCA™ protein genetic diner 

390 400 

Val Lys Pro Gly AU Ser Vol Lys He S?: Z\3 Ip AU Ser Gly Tyr Thr Phe Thr Asp 

GTG AAA CCT GGG GCT TCA GTG AAG ATT TCC TC'J AAG GCT TCT GGC TAG ACC TTC ACT GAC 

410 420 
His Ala Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe
CAT GCA ATT CAC TGG GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT 

430 440 
Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu Thr
TCT CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG GGC AAG GCC ACA CTG ACT

450 460 
Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser
GCA GAC AAA TCC TCC AGC ACT GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT

470 480
Ala Val Tyr Phe Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val
GCA GTG TAT TTC TGT ACA AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC



Thr Val Ser *** *** Asp

ACC GTC TCC TAA TAG GAT CC

Bam HI

FIG. 16C
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4-4-20 VL/217/CC49 VH gene

4-4-20 VL 10

Asp Val Val Met Thr Gln Thr Pro Leu Ser Leu Pro Val Ser Leu Gly Asp Gln Ala Ser

GAC GTC GTT ATG ACT CAG ACA CCA CfA TCA C]i lC I Gil AGT CTA GGT GAT CAA GCC TCC 

30 40 

Ile Ser Cys Arg Ser Ser Gln Ser Leu Val His Ser Asn Gly Asn Thr Tyr Leu Arg Trp

ATC TCT TGC AGA TCT AGT CAG ACC CTT GTA CAC a:^[ AAT GGA AAC ACC TAT UA C6T TG6 

50 60 

Tyr Leu Gln Lys Pro Gly Gln Ser Pro Lys Val Leu Ile Tyr Lys Val Ser Asn Arg Phe

TAG CTG CAG AAG CCA GGC CAG TJ' ClA A'.3 G": AIC IAC AAA GTT TCC AAC CGA TTT 

70 80 

Ser Gly Val Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Lys Ile

TCT GGG GTC CCA GAC AGG TTC A:r l :C AGT .GA v G:] i CA GAT TTC ACA GTC AAG ATC 

90 100 

Ser Arg Val Glu Ala Glu Asp Leu Gly Val Tyr Phe Cys Ser Gln Ser Thr His Val Pro

AGC AGA GTG GAG GCT GAG CAT C oLA GTi T/ ' ^: :GC TCT CAA AGT ACA CAT. GTT CCG 

110 217 Linker 120

Trp Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Gly Ser Thr Ser Gly Lys Pro Ser

TGG ACQ TTC GGT GGA GGC ACC ATi . 1_ GAA A': AAA ■£] TCT ACC TCT GGT AAA CCA TCT 

Hind III

CC49 VH 130 140

Glu Gly Lys Gly Gln Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala

GAA GGC AAA GGT CAG GTT CAG CTG CAG CAG TCT GAC GCT GAG TTG GTG AAA CCT GGG GCT

Pvu II Pst I

150 160 

Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala Ile His Trp

TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC TAC ACC TTC ACT GAC CAT GCA ATT CAC TGG

170 180 

Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser Pro Gly Asn Asp

GTG AAA CAG AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT CCC GGA AAT GAT

FIG. 19A
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4-4-20 VL/217/CC49 gene 

190 200 
Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser
GAT TTT AAA TAC AAT GAG AGG TTC AAG GGC AAG GCC ACA CTG ACT GCA GAC AAA TCC TCC

210 220
Ser Thr Ala Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys
AGC ACT GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT GCA GTG TAT TTC TGT

240 

Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser

ACA AGA TCC CTG AAT ATG GCC TAC TGG GGT CAA GGA ACC TCA GTC ACC GTC TCC TAA TAG



Asp
GAT CC
Bam HI
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CC49 VL/217/4-4-20 gene
CC49 VL 10 20

Asp Val Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly Glu Lys Val Thr
GAC GTC GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA GTT GGC GAG AAG GTT ACT
Aat II

30 40

Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala
TTG AGC TGC AAG TCC AGT CAG AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC

50 60

Trp Tyr Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg
TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG ATT TAC TGG GCA TCC GCT AGG

70 80

Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
GAA TCT GGG GTC CCT GAT CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC TCC

90 100

Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr Ser Tyr
ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT TAC TGT CAG CAG TAT TAT AGC TAT

110 217 Linker 120

Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu Lys Gly Ser Thr Ser Gly Lys Pro
CCC CTC ACG TTC GGT GCT GGG ACC AAG CTT GTG CTG AAA GGG TCT ACT TCC GGT AAA CCA
Hind III

4-4-20 VH 130 140

Ser Glu Gly Lys Gly Glu Val Lys Leu Asp Glu Thr Gly Gly Gly Leu Val Gln Pro Gly
TCT GAA GGT AAA GGT GAA /■. n :TG X;.G ACT GGA GGA GGC TTG GTG CAA CCT GGG

150 160

Arg Pro Met Lys Leu Ser Cys Val Ala Ser Gly Phe Thr Phe Ser Asp Tyr Trp Met Asn
AGG CCC ATG AAA CTC TCC TG: G i GCG TCT GGA TTC ACT TTT AGT GAC TAC TGG ATG AAC

170 180

Trp Val Arg Gln Ser Pro Glu Lys Gly Leu Glu Trp Val Ala Gln Ile Arg Asn Lys Pro
TGG GTC CGC CAG TCT CCA GAG AAA GGA CTG GAG TGG GTA GCA CAA ATT AGA AAC AAA CCT
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CC49 VL/217/4-4-20 gene

200

Tyr Asn Tyr Glu Thr Tyr Tyr Ser Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp

TAT AAT TAT GAA ACA TAT TAT TCA GAT TCT GTG AAA GGC AGA TTC ACC ATC TCA AGA GAT

210 220 

Asp Ser Lys Ser Ser Val Tyr Leu Gln Met Asn Asn Leu Arg Val Glu Asp Met Gly Ile

GAT TCG AAA AGT AGT GTC TAC CTG CAA ATG AAC AAC TTA AGA GTT GAA GAC ATG GGT ATC 

230 240

Tyr Tyr Cys Thr Gly Ser Tyr Tyr Gly Met Asp Tyr Trp Gly Gln Gly Thr Ser Val Thr

TAT TAC TGT ACG GGT TCT TAC lA: GGT ATG GAC TAC TGG GGT CAA GGA ACC TCA GTC ACC



Val Ser *** *** Gly Ser
GTC TCC TAA TAA GGA TCC
Bam HI
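
In the sequence figures above, each amino-acid line is the codon-by-codon translation of the DNA line beneath it. The following is a minimal sketch, not part of the original figures, of how a DNA line can be checked against its amino-acid line. It assumes the standard genetic code and prints one-letter amino-acid codes, with `*` for a stop codon, rather than the three-letter codes used in the figures. The names `CODON_TABLE` and `translate` are hypothetical helpers chosen for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acids.

    A trailing partial codon, such as the final 'CC' of a restriction site,
    is ignored. Stop codons are returned as '*'.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Last line of the CC49 VL/217/4-4-20 gene: Val Ser stop stop Gly Ser
print(translate("GTC TCC TAA TAA GGA TCC"))  # VS**GS
```

A codon containing an unreadable character raises `KeyError`, which makes it easy to locate positions in a scanned listing that still need to be checked against the original figure.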
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PROCESSING FILE: PolyCatA/Proc.CC-I^Prep

METHOD: PREP POLY CAT A#2 

INJECT VOL 44 

SAMPLING INT: 0.3 SECONDS

CHROMATOGRAM:

[chromatogram trace]

ANALYSIS:   CHANNEL A

PEAK NO.   TIME     HEIGHT (µV)   AREA (µV-SEC)   AREA %
 1         17.090   [illegible]          348239     0.778
 2         18.940   [illegible]          669441     1.496
 3         21.775   [illegible]         8617252    19.263
 4         30.100   [illegible]         9753616    21.804
 5         33.455   [illegible]        15749605    35.208
 6         38.940   [illegible]         2833701     6.334
 7         42.010   [illegible]         1637917     3.661
 8         44.540   [illegible]         1968584     4.400
 9         57.055   [illegible]         2012338     4.498
10         57.610   [illegible]          210914     0.471
11         58.240   [illegible]          930855     2.080

TOTAL AREA                             44732462    99.993

FIG
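
The AREA % column in the report above can be reproduced from the AREA column. The sketch below is an illustration, not part of the original report: it divides each peak area by the total area and truncates to three decimal places. Truncation rather than rounding is an inference from the printed figures, which sum to 99.993 rather than 100. The function name `area_percent_thousandths` is hypothetical.

```python
# Peak areas (µV-sec) for peaks 1-11, as printed in the report.
areas = [348239, 669441, 8617252, 9753616, 15749605, 2833701,
         1637917, 1968584, 2012338, 210914, 930855]


def area_percent_thousandths(areas):
    """Return (total area, per-peak area % in thousandths of a percent).

    Integer arithmetic is used so that truncation to three decimals is
    exact and not affected by floating-point rounding.
    """
    total = sum(areas)
    return total, [a * 100_000 // total for a in areas]


total, pct = area_percent_thousandths(areas)
for peak, (area, p) in enumerate(zip(areas, pct), start=1):
    print(f"{peak:2d} {area:9d} {p / 1000:7.3f}")
print("TOTAL AREA", total, f"{sum(pct) / 1000:.3f}")
```

Under this assumption the three largest peaks (3, 4 and 5) together account for a little over 76% of the integrated area.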
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